WO2005089403A2 - Methods and apparatus for navigating an image - Google Patents
- Publication number
- WO2005089403A2 (PCT/US2005/008812)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- elements
- zoom level
- image
- linear size
- pixel
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04845—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B29/00—Maps; Plans; Charts; Diagrams, e.g. route diagram
- G09B29/003—Maps
- G09B29/006—Representation of non-cartographic information on maps, e.g. population distribution, wind direction, radiation levels, air and sea routes
- G09B29/007—Representation of non-cartographic information on maps, e.g. population distribution, wind direction, radiation levels, air and sea routes using computer methods
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B29/00—Maps; Plans; Charts; Diagrams, e.g. route diagram
- G09B29/10—Map spot or coordinate position indicators; Map reading aids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/048—Indexing scheme relating to G06F3/048
- G06F2203/04806—Zoom, i.e. interaction techniques or interactors for controlling the zooming operation
Definitions
- the present invention relates to methods and apparatus for navigating, such as zooming and panning, over an image of an object in such a way as to provide the appearance of smooth, continuous navigational movement.
- GUIs graphical computer user interfaces
- visual components may be represented and manipulated such that they do not have a fixed spatial scale on the display; indeed, the visual components may be panned and/or zoomed in or out.
- the ability to zoom in and out on an image is desirable in connection with, for example, viewing maps, browsing through text layouts such as newspapers, viewing digital photographs, viewing blueprints or diagrams, and viewing other large data sets.
- zoomable components are a peripheral aspect of a user's interaction with the software and the zooming feature is only employed occasionally.
- These computer applications permit a user to pan over an image smoothly and continuously (e.g., utilizing scroll bars or the cursor to translate the viewed image left, right, up or down).
- a significant problem with such computer applications is that they do not permit a user to zoom smoothly and continuously. Indeed, they provide zooming in discrete steps, such as 10%, 25%, 50%, 75%, 100%, 150%, 200%, 500%, etc.
- the user selects the desired zoom using the cursor and, in response, the image changes abruptly to the selected zoom level.
- the undesirable qualities of discontinuous zooming also exist in Internet-based computer applications.
- the computer application underlying the www.mapquest.com website illustrates this point.
- the MapQuest website permits a user to enter one or more addresses and receive an image of a roadmap in response.
- FIGS. 1-4 are examples of images that one may obtain from the MapQuest website in response to a query for a regional map of Long Island, NY, U.S.A.
- the MapQuest website permits the user to zoom in and zoom out to discrete levels, such as 10 levels.
- FIG. 1 is a rendition at zoom level 5, which is approximately 100 meters/pixel.
- FIG. 2 is an image at a zoom level 6, which is about 35 meters/pixel.
- FIG. 3 is an image at a zoom level 7, which is about 20 meters/pixel.
- FIG. 4 is an image at a zoom level 9, which is about 10 meters/pixel.
- the abrupt transitions between zoom levels result in a sudden and abrupt loss of detail when zooming out and a sudden and abrupt addition of detail when zooming in.
- no local, secondary or connecting roads may be seen in FIG. 1 (at zoom level 5), although secondary and connecting roads suddenly appear in FIG. 2, which is the very next zoom level.
- Such abrupt discontinuities are very displeasing when utilizing the MapQuest website.
- A1 roads may be about 16 meters wide, A2 roads may be about 12 meters wide, A3 roads may be about 8 meters wide, A4 roads may be about 5 meters wide, and A5 roads may be about 2.5 meters wide.
- the MapQuest computer application deals with these varying levels of coarseness by displaying only the road categories deemed appropriate at a particular zoom level. For example, a nation-wide view might only show A1 roads, while a state-wide view might show A1 and A2 roads, and a county-wide view might show A1, A2 and A3 roads. Even if MapQuest were modified to allow continuous zooming of the roadmap, this approach would lead to the sudden appearance and disappearance of road categories during zooming, which is confusing and visually displeasing. In view of the foregoing, there are needs in the art for new methods and apparatus for navigating images of complex objects, which permit smooth and continuous zooming of the image while also preserving visual distinctions between the elements of the objects based on their size or importance.
- methods and apparatus are contemplated to perform various actions, including: zooming into or out of an image having at least one object, wherein at least some elements of at least one object are scaled up and/or down in a way that is non-physically proportional to one or more zoom levels associated with the zooming.
- the scale power a is not equal to -1 (typically -1 < a < 0) within a range of zoom levels z0 and z1, where z0 is of a lower physical linear size/pixel than z1.
- At least one of z0 and z1 may vary for one or more elements of the object. It is noted that a, c and d may also vary from element to element. At least some elements of the at least one object may also be scaled up and/or down in a way that is physically proportional to one or more zoom levels associated with the zooming.
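The symbols a, c, d, z0 and z1 are used above without the expression that ties them together. A power law of the following form is consistent with how they are described (a scale power, a constant, and a physical linear size). It is written out here as an assumption, since the expression itself does not appear in this excerpt; p denotes the rendered linear size in pixels and z the zoom level in physical linear size/pixel.

```latex
p(z) = c \cdot d \cdot z^{a}, \qquad z_0 \le z \le z_1, \qquad a \ne -1
```

Under this reading, setting a = -1 and c = 1 gives p = d / z, which is the physically proportional case: halving the physical size per pixel doubles the rendered size.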
- the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
- the elements of the object may be of varying degrees of coarseness.
- the coarseness of the elements of a roadmap object manifests because there are considerably more A4 roads than A3 roads, there are considerably more A3 roads than A2 roads, and there are considerably more A2 roads than A1 roads.
- Degree of coarseness in road categories also manifests in such properties as average road length, frequency of intersections, and maximum curvature.
- the coarseness of the elements of other image objects may manifest in other ways too numerous to list in their entirety.
- the scaling of the elements in a given predetermined image may be physically proportional or non-physically proportional based on at least one of: (i) a degree of coarseness of such elements; and (ii) the zoom level of the given predetermined image.
- the object may be a roadmap
- the elements of the object may be roads
- the varying degrees of coarseness may be road hierarchies.
- the scaling of a given road in a given predetermined image may be physically proportional or non-physically proportional based on: (i) the road hierarchy of the given road; and (ii) the zoom level of the given predetermined image.
- methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of a roadmap; receiving one or more user navigation commands including zooming information at the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
- methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of at least one object, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and (ii) non-physically proportional to the zoom level; receiving one or more user navigation commands including zooming information at the client terminal; blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands; and displaying the intermediate image on the client terminal.
- methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of a roadmap to a client terminal over a communications channel; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
- methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of at least one object to a client terminal over a communications channel, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and (ii) non-physically proportional to the zoom level; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; blending two of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands; and displaying the intermediate image on the client terminal.
- FIG. 1 is an image taken from the MapQuest website, which is at a zoom level 5;
- FIG. 2 is an image taken from the MapQuest website, which is at a zoom level 6;
- FIG. 3 is an image taken from the MapQuest website, which is at a zoom level 7;
- FIG. 4 is an image taken from the MapQuest website, which is at a zoom level 9;
- FIG. 5 is an image of Long Island produced at a zoom level of about 334 meters/pixel in accordance with one or more aspects of the present invention;
- FIG. 6 is an image of Long Island produced at a zoom level of about 191 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 7 is an image of Long Island produced at a zoom level of about 109.2 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 8 is an image of Long Island produced at a zoom level of about 62.4 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 9 is an image of Long Island produced at a zoom level of about 35.7 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 10 is an image of Long Island produced at a zoom level of about 20.4 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 11 is an image of Long Island produced at a zoom level of about 11.7 meters/pixel in accordance with one or more further aspects of the present invention
- FIG. 12 is a flow diagram illustrating process steps that may be carried out in order to provide smooth and continuous navigation of an image in accordance with one or more aspects of the present invention
- FIG. 13 is a flow diagram illustrating further process steps that may be carried out in order to smoothly navigate an image in accordance with various aspects of the present invention
- FIG. 14 is a log-log graph of a line width in pixels versus a zoom level in meters/pixel illustrating physical and non-physical scaling in accordance with one or more further aspects of the present invention;
- FIG. 15 is a log-log graph illustrating variations in the physical and non-physical scaling of FIG. 14.
- FIGS. 16A-D illustrate respective antialiased vertical lines whose endpoints are precisely centered on pixel coordinates;
- FIGS. 17A-C illustrate respective antialiased lines on a slant, with endpoints not positioned to fall at exact pixel coordinates;
- FIG. 18 is the log-log graph of line width versus zoom level of FIG. 14 including horizontal lines indicating incremental line widths, and vertical lines spaced such that the line width over the interval between two adjacent vertical lines changes by no more than two pixels.
- FIGS. 5-11 show a series of images representing the road system of Long Island, NY, U.S.A., where each image is at a different zoom level (or resolution).
- the extent of images and implementations for which the present invention may be employed are too numerous to list in their entirety.
- the features of the present invention may be used to navigate images of the human anatomy, complex topographies, engineering diagrams such as wiring diagrams or blueprints, gene ontologies, etc. It has been found, however, that the invention has particular applicability to navigating images in which the elements thereof are of varying levels of detail or coarseness. Therefore, for the purposes of brevity and clarity, the various aspects of the present invention will be discussed in connection with a specific example, namely, images of a roadmap.
- the image 100A of the roadmap illustrated in FIG. 5 is at a zoom level that may be characterized by units of physical length/pixel (or physical linear size/pixel).
- the zoom level, z, represents the actual physical linear size that a single pixel of the image 100A represents.
- the zoom level is about 334 meters/pixel.
- FIG. 6 is an image 100B of the same roadmap as FIG.
- Another significant feature of the present invention as illustrated in FIGS. 5-11 is that little or no detail abruptly appears or disappears when zooming from one level to another level.
- the roadmap includes elements (i.e., roads) of varying degrees of coarseness.
- the image 100D of FIG. 8 includes at least A1 highways such as 102, A3 secondary roads such as 104, and A4 local roads such as 106. Yet these details, even the A4 local roads 106, may still be seen in image 100A of FIG. 5, which is substantially zoomed out in comparison with the image 100D of FIG. 8.
- the A1, A2, A3, and A4 roads may be distinguished from one another. Even differences between A1 primary highways 102 and A2 primary roads 108 may be distinguished from one another vis-a-vis the relative
- FIGS. 12-13 are flow diagrams illustrating process steps that are preferably carried out by the one or more computing devices and/or related equipment.
- the process flow is carried out by commercially available computing equipment (such as Pentium-based computers)
- any of a number of other techniques may be employed to carry out the process steps without departing from the spirit and scope of the present invention as claimed.
- the hardware employed may be implemented utilizing any other known or hereinafter developed technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, one or more programmable digital devices or systems, such as programmable read only memories (PROMs), programmable array logic devices (PALs), any combination of the above, etc.
- the methods of the present invention may be embodied in a software program that may be stored on any of the known or hereinafter developed media.
- FIG. 12 illustrates an embodiment of the invention in which a plurality of images are prepared (each at a different zoom level or resolution), action 200, and two or more of the images are blended together to achieve the appearance of smooth navigation, such as zooming (action 206).
- a service provider would expend the resources to prepare a plurality of pre-rendered images (action 200) and make the images available to a user's client terminal over a communications channel, such as the Internet (action 202).
- the pre-rendered images may be an integral or related part of an application program that the user loads and executes on his or her computer. It has been found through experimentation that, when the blending approach is used, a set of images at the following zoom levels works well when the image object is a roadmap: 30 meters/pixel, 50 meters/pixel, 75 meters/pixel, 100 meters/pixel, 200 meters/pixel, 300 meters/pixel, 500 meters/pixel, 1000 meters/pixel, and 3000 meters/pixel. It is noted, however, that any number of images may be employed at any number of resolutions without departing from the scope of the invention.
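One way to decide which pre-rendered images to blend for a requested zoom level is to look up the two listed levels that bracket it. The sketch below is illustrative only: the function name `bracketing_levels` and the clamping behavior outside the listed range are assumptions, not part of the described method.

```python
from bisect import bisect_left

# Zoom levels (meters/pixel) of the pre-rendered images listed in the text.
PRERENDERED_LEVELS = [30, 50, 75, 100, 200, 300, 500, 1000, 3000]


def bracketing_levels(z, levels=PRERENDERED_LEVELS):
    """Return the (finer, coarser) pre-rendered levels that bracket zoom z.

    If z matches a level exactly, or lies outside the listed range, both
    entries of the returned pair are the same level (clamping is an assumption).
    """
    if z <= levels[0]:
        return levels[0], levels[0]
    if z >= levels[-1]:
        return levels[-1], levels[-1]
    i = bisect_left(levels, z)       # index of the first level >= z
    if levels[i] == z:
        return levels[i], levels[i]
    return levels[i - 1], levels[i]


print(bracketing_levels(62.4))
print(bracketing_levels(35.7))
```

A sorted list with binary search keeps the lookup cheap even if many more levels are added.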
- in response to user-initiated navigation commands (action 204), such as zooming commands, the client terminal is preferably operable to blend two or more images in order to produce an intermediate resolution image that coincides with the navigation command (action 206).
- This blending may be accomplished by a number of methods, such as the well-known trilinear interpolation technique described by Lance Williams, Pyramidal Parametrics, Computer Graphics, Proc. SIGGRAPH '83, 17(3): 1-11 (1983), the entire disclosure of which is incorporated herein by reference.
- the present invention does not require or depend on any particular one of these blending methods.
- the user may wish to navigate to a zoom level of 62.4 meters/pixel.
- this zoom level may be between two of the pre-rendered images (e.g., in this example between zoom level 50 meters/pixel and zoom level 75 meters/pixel)
- the desired zoom level of 62.4 meters/pixel may be achieved using the trilinear interpolation technique.
- any zoom level between 50 meters/pixel and 75 meters/pixel may be obtained utilizing a blending method as described above, which if performed quickly enough provides the appearance of smooth and continuous navigation.
- The blending technique may be carried through to other zoom levels, such as the 35.7 meters/pixel level illustrated in FIG. 9.
- the blending technique may be performed as between the pre-rendered images of 30 meters/pixel and 50 meters/pixel of the example discussed thus far.
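The between-level part of such a blend can be sketched as a simple cross-fade. This is a minimal illustration under stated assumptions, not the referenced technique in full: the names `blend_weight` and `blend` are hypothetical, the weight is interpolated linearly in log(z) (an assumption; other weightings are possible), and both images are assumed to have already been resampled to the same pixel grid, which is the step that trilinear interpolation handles with bilinear filtering within each level.

```python
import math


def blend_weight(z, z_fine, z_coarse):
    """Weight given to the coarser image for a target zoom level z.

    Interpolates linearly in log(z) between the two pre-rendered levels
    (an assumption made for this sketch) and clamps the result to [0, 1].
    """
    if z_fine == z_coarse:
        return 0.0
    t = (math.log(z) - math.log(z_fine)) / (math.log(z_coarse) - math.log(z_fine))
    return min(1.0, max(0.0, t))


def blend(img_fine, img_coarse, w):
    """Per-pixel linear blend of two equal-size grayscale images (lists of rows)."""
    return [
        [(1.0 - w) * a + w * b for a, b in zip(row_fine, row_coarse)]
        for row_fine, row_coarse in zip(img_fine, img_coarse)
    ]


# Target 62.4 meters/pixel, between the 50 and 75 meters/pixel images.
w = blend_weight(62.4, 50.0, 75.0)
tiny_fine = [[0.0, 100.0], [200.0, 50.0]]
tiny_coarse = [[100.0, 0.0], [100.0, 150.0]]
print(w, blend(tiny_fine, tiny_coarse, w))
```

The same two functions cover the 35.7 meters/pixel case by passing 30 and 50 as the bracketing levels.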
- the above blending approach may be used when the computing power of the processing unit on which the invention is carried out is not high enough to (i) perform the rendering operation in the first instance, and/or (ii) perform image rendering "just-in-time" or "on the fly" (for example, in real time) to achieve a high image frame rate for smooth navigation.
- FIG. 13 illustrates the detailed steps and/or actions that are preferably conducted to prepare one or more images in accordance with the present invention.
- the information is obtained regarding the image object or objects using any of the known or hereinafter developed techniques.
- image objects have been modeled using appropriate primitives, such as polygons, lines, points, etc.
- UTM Universal Transverse Mercator
- the model is usually in the form of a list of line segments (in any coordinate system) that comprise the roads in the zone.
- the list may be converted into an image in the spatial domain (a pixel image) using any of the known or hereinafter developed rendering processes so long as it incorporates certain techniques for determining the weight (e.g., apparent or real thickness) of a given primitive in the pixel (spatial) domain.
- the rendering processes should incorporate certain techniques for determining the weight of the lines that model the roads of the roadmap in the spatial domain. These techniques will be discussed below.
- the elements of the object are classified.
- the classification may take the form of recognizing already existing categories, namely, A1, A2, A3, A4, and A5. Indeed, these road elements have varying degrees of coarseness and, as will be discussed below, may be rendered differently based on this classification.
- mathematical scaling is applied to the different road elements based on the zoom level. As will be discussed in more detail below, the mathematical scaling may also vary based on the element classification.
- the pre-set pixel width approach dictates that every road is a certain pixel width, such as one pixel in width on the display.
- Major roads, such as highways, may be emphasized by making them two pixels wide, etc.
- this approach makes the visual density of the map change as one zooms in and out. At some level of zoom, the result might be pleasing, e.g., at a small-size county level. As one zooms in, however, roads would not thicken, making the map look overly sparse. Further, as one zooms out, roads would run into each other, rapidly forming a solid nest in which individual roads would be indistinguishable.
- the images are produced in such a way that at least some image elements are scaled up and/or down either (i) physically proportional to the zoom level; or (ii) non-physically proportional to that zoom level, depending on parameters that will be discussed in more detail below. It is noted that the scaling being "physically proportional to the zoom level" means that the number of pixels representing the road width increases or decreases with the zoom level as the size of an element would appear to change with its distance from the human eye.
- zooming in is equivalent to moving an object closer to the viewer, and zooming out is equivalent to moving the object farther away.
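Physically proportional scaling can be illustrated with the road widths quoted earlier: the width in pixels is the physical width divided by the zoom level in meters/pixel. The sketch below is an illustration only; the dictionary and function names are hypothetical.

```python
# Approximate physical widths in meters per road category, as quoted in the text.
ROAD_WIDTH_M = {"A1": 16.0, "A2": 12.0, "A3": 8.0, "A4": 5.0, "A5": 2.5}


def physical_width_px(category, z):
    """Width in pixels under physically proportional scaling: p = d / z.

    z is the zoom level in meters/pixel (larger z means zoomed further out).
    """
    if z <= 0:
        raise ValueError("zoom level must be positive")
    return ROAD_WIDTH_M[category] / z


for z in (0.5, 10.0, 100.0):
    widths = {c: round(physical_width_px(c, z), 3) for c in ROAD_WIDTH_M}
    print(f"{z:6.1f} m/px -> {widths}")
```

At 100 meters/pixel every category comes out well under one pixel wide, which shows why purely physical scaling cannot keep roads visible when zoomed far out.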
- a may be set to a power law other than -1
- d' may be set to a physical linear size other than the actual physical linear size d.
- "non-physically proportional to the zoom level" means that the road width in display pixels increases or decreases with the zoom level in a way other than being physically proportional to the zoom level, i.e., a ≠ -1.
- the scaling is distorted in a way that achieves certain desirable results.
- linear size means one-dimensional size. For example, if one considers any 2 dimensional object and doubles
- the linear sizes of the elements of an object may involve length, width, radius, diameter, and/or any other measurement that one can read off with a ruler on the Euclidean plane.
- the thickness of a line, the length of a line, the diameter of a circle or disc, the length of one side of a polygon, and the distance between two points are all examples of linear sizes. In this sense the "linear size" in two dimensions is the distance between two identified points of an object on a 2D Euclidean plane.
- a ⁇ 0 will cause the rendered size of an element to decrease as one zooms out, and increase as one zooms in.
- the rendered size of the element will decrease faster than it would with proportional physical scaling as one zooms out.
- the size of the rendered element decreases more slowly than it would with proportional physical scaling as one zooms out.
- p(z), for a given length of a given object, is permitted to be substantially continuous so that during navigation the user does not experience a sudden jump or discontinuity in the size of an element of the image (as opposed to the conventional approaches that permit the most extreme discontinuity - a sudden appearance or disappearance of an element during navigation).
- p(z) monotonically decreases with zooming out, such that zooming out causes the elements of the object to become smaller (e.g., roads to become thinner), and such that zooming in causes the elements of the object to become larger. This gives the user a sense of physicality about the object(s) of the image.
- (i) that the scaling of the road widths may be physically proportional to the zoom level when zoomed in (e.g., up to about 0.5 meters/pixel); (ii) that the scaling of the road widths may be non-physically proportional to the zoom level when zoomed out (e.g., above about 0.5 meters/pixel); and (iii) that the scaling of the road widths may be physically proportional to the zoom level when zoomed further out (e.g., above about 50 meters/pixel or higher depending on parameters which will be discussed in more detail below).
- a = -1.
- z0 = 0.5 meters/pixel, or 2 pixels/meter, which when expressed as a map scale on a 15 inch display (with 1600x1200 pixel resolution) corresponds to a scale of about 1:2600.
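The map-scale figure quoted for 0.5 meters/pixel can be checked with a short calculation. The sketch below is only illustrative: it assumes the 15 inch figure is the display diagonal and that pixels are square, and the function name `map_scale` is hypothetical.

```python
import math

def map_scale(meters_per_pixel, diagonal_in, width_px, height_px):
    """Return N for a map scale of 1:N at the given zoom level.

    Assumes square pixels and that diagonal_in is the display diagonal.
    """
    diagonal_px = math.hypot(width_px, height_px)
    pixels_per_inch = diagonal_px / diagonal_in
    pixel_size_m = 0.0254 / pixels_per_inch  # one inch is 0.0254 meters
    return meters_per_pixel / pixel_size_m

scale = map_scale(0.5, 15.0, 1600, 1200)
print(f"1:{scale:.0f}")
```

Under these assumptions the result lands close to the "about 1:2600" quoted in the text.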
- d = 16 meters, which is a reasonable real physical width for A1 roads; the rendered road will appear to be its actual size when one is zoomed in (0.5 meters/pixel or less).
- the rendered line is about 160 pixels wide.
- this permits the A1 road to remain visible (and distinguishable from other smaller roads) as one zooms out.
- the width of the rendered line using physical scaling would have been about 0.005 pixels at a zoom level of about 3300 meters/pixel, rendering it virtually invisible.
- the width of the rendered line is about 0.8 pixels at a zoom level of 3300 meters/pixel, rendering it clearly visible.
- the value for z1 is chosen to be the most zoomed-out scale at which a given road still has "greater than physical" importance.
- the resolution would be approximately 3300 meters/pixel or 3.3 kilometers/pixel. If one looks at the entire world, then there may be no reason for U.S. highways to assume enhanced importance relative to the view of the country alone.
- the scaling of the road widths is again physically proportional to the zoom level, but preferably with a large d' (much greater than the real width d) for continuity of p(z).
- a new imputed physical width of the A1 highway is chosen, for example, d' = 1.65 kilometers. z1 and the new value for d' are preferably chosen in such a way that, at the outer scale z1, the rendered width of the line will be a reasonable number of pixels.
- A1 roads may be about half a pixel wide, which is thin but still clearly visible; this corresponds to an imputed physical road width of 1650 meters, or 1.65 kilometers.
- p(z) has six parameters: z0, z1, d0, d1, d2 and a.
- z0 and z1 mark the scales at which the behavior of p(z) changes.
- zooming is physical (i.e., the exponent of z is -1), with a physical width of d0, which preferably corresponds to the real physical width d.
- zooming is again physical, but with a physical width of d1, which in general does not correspond to d.
- the rendered line width scales with a power law of a, which can be a value other than -1.
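The three scaling regimes can be sketched as a single piecewise function. This is a minimal illustration, not the patent's own formula: it assumes the middle regime is a pure power law whose exponent and coefficient are derived from the continuity requirement at the two breakpoints, and it uses the A1 example values from the text (16 meters at 0.5 meters/pixel, 1650 meters at 3300 meters/pixel). The function and parameter names are hypothetical.

```python
import math

def make_width_function(inner_width_m, z_inner, outer_width_m, z_outer):
    """Build p(z): rendered width in pixels at zoom z (meters/pixel).

    Assumption: the exponent and coefficient of the middle regime are
    solved from continuity at z_inner and z_outer.
    """
    p_inner = inner_width_m / z_inner   # pixels at the inner breakpoint
    p_outer = outer_width_m / z_outer   # pixels at the outer breakpoint
    exponent = math.log(p_outer / p_inner) / math.log(z_outer / z_inner)
    coefficient = p_inner / z_inner ** exponent

    def p(z):
        if z <= z_inner:                # physical scaling, real width
            return inner_width_m / z
        if z >= z_outer:                # physical scaling, imputed width
            return outer_width_m / z
        return coefficient * z ** exponent  # non-physical power law

    return p, exponent

p, a = make_width_function(16.0, 0.5, 1650.0, 3300.0)
for z in (0.1, 0.5, 50.0, 3300.0):
    print(f"z = {z:7.1f} m/pixel -> {p(z):8.3f} pixels")
```

With these inputs the derived exponent lies between -1 and 0, so the road thins more slowly than physical scaling would dictate while p(z) stays continuous and monotonically decreasing.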
- d0 = 8 meters
- z0 = 0.5 meters/pixel
- z1 = 50 meters/pixel
- d2 = 100 meters.
- the dotted lines all have a slope of -1 and represent physical scaling at different physical widths. From the top down, the corresponding physical widths of these dotted lines are: 1.65 kilometers, 312 meters, 100 meters, 20 meters, 16 meters, 12 meters, 8 meters, 5 meters, and 2.5 meters.
- by interpolation between a plurality of pre-rendered images, it is possible in many cases to ensure that the resulting interpolation is humanly indistinguishable or nearly indistinguishable from an ideal rendition of all lines or other primitive geometric elements at their correct pixel widths as determined by the physical and non-physical scaling equations.
- this approach is designed to ensure that the line integral of the intensity function (or "1-intensity" function, for black lines on a white background) over a perpendicular to the line drawn is equal to the line width.
- This method generalizes readily to lines whose endpoints do not lie precisely in the centers of pixels, to lines which are in other orientations than vertical, and to curves. Note that drawing the antialiased vertical lines of FIGS. 16A-D could also be accomplished by alpha-blending two images, one (image A) in which the line is 1 pixel wide, and the other (image
- in FIGS. 17A-C, a 1 pixel wide line (FIG. 17A), a 2 pixel wide line (FIG. 17B) and a 3 pixel wide line (FIG. 17C) are illustrated in an arbitrary orientation.
- the same principle applies to the arbitrary orientation of FIGS. 17A-C as to the case where the lines are aligned exactly to the pixel grid, although the spacing of the line widths between which to alpha-blend may need to be finer than two pixels for good results.
- FIG. 18 is substantially similar to FIG. 14 except that FIG. 18 includes a set of horizontal lines and vertical lines.
- the horizontal lines indicate line widths between 1 and 10 pixels, in increments of one pixel.
- the vertical lines are spaced such that line width over the interval between two adjacent vertical lines changes by no more than two pixels.
- the vertical lines represent a set of zoom values suitable for pre-rendition, wherein alpha-blending between two adjacent such pre-rendered images will produce characteristics nearly equivalent to rendering the lines representing roads at continuously variable widths.
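The idea of reaching an intermediate line width by alpha-blending two pre-rendered images can be illustrated on a one-dimensional cross-section perpendicular to the line. The sketch below is an assumption-laden illustration: the linear choice of alpha and the function names are hypothetical, and the profiles are "1-intensity" values for a black line on a white background.

```python
def blend_alpha(target_width, width_a, width_b):
    """Blend weight for image B so the blended line integrates to target_width."""
    if not width_a <= target_width <= width_b:
        raise ValueError("target width must lie between the two pre-rendered widths")
    return (target_width - width_a) / (width_b - width_a)

def blend_profiles(profile_a, profile_b, alpha):
    """Per-pixel alpha blend of two cross-sections of equal length."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(profile_a, profile_b)]

# Cross-sections through a 1 pixel wide and a 3 pixel wide vertical line.
profile_a = [0.0, 0.0, 1.0, 0.0, 0.0]
profile_b = [0.0, 1.0, 1.0, 1.0, 0.0]

alpha = blend_alpha(2.0, 1.0, 3.0)
profile = blend_profiles(profile_a, profile_b, alpha)
print(profile, sum(profile))
```

Because blending is linear, the sum of the blended profile equals the target width, which is the property the antialiasing approach is designed to preserve.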
- the present invention may be employed by an Internet website that provides maps and driving directions to client terminals in response to user requests.
- various aspects of the invention may be employed in a GPS navigation system in an automobile.
- the invention may also be incorporated into medical imaging equipment, whereby detailed information concerning, for example, a patient's circulatory system, nervous system, etc. may be rendered and navigated as discussed hereinabove.
- the applications of the invention are too numerous to list in their entirety, yet a skilled artisan will recognize that they are contemplated herein and fall within the scope of the invention as claimed.
- the present invention may also be utilized in connection with other applications in which the rendered images provide a means for advertising and otherwise advancing commerce. Additional details concerning these aspects and uses of the present invention may be found in U.S.
- the present invention relates generally to graphical zooming user interfaces (ZUI) for computers. More specifically, the invention is a system and method for progressively rendering zoomable visual content in a manner that is both computationally efficient, resulting in good user responsiveness and interactive frame rates, and exact, in the sense that vector drawings, text, and other non-photographic content is ultimately drawn without the resampling which would normally lead to degradation in image quality, and without interpolation of other images, which would also lead to degradation.
- ZUI graphical zooming user interfaces
- GUIs graphical computer user interfaces
- visual components could be represented and manipulated in such a way that they do not have a fixed spatial scale on the display, but can be zoomed in or out.
- the desirability of zoomable components is obvious in many application domains; to name only a few: viewing maps, browsing through large heterogeneous text layouts such as newspapers, viewing albums of digital photographs, and working with visualizations of large data sets.
- Even when viewing ordinary documents, such as spreadsheets and reports, it is often useful to be able to glance at a document overview, and then zoom in on an area of interest.
- zoomable components such as Microsoft® Word ® and other Office ® products (Zoom under the View menu), Adobe ® Photoshop ®, Adobe ® Acrobat ®, QuarkXPress ®, etc.
- these applications allow zooming in and out of documents, but not necessarily zooming in and out of the visual components of the applications themselves. Further, zooming is normally a peripheral aspect of the user's interaction with the software, and the zoom setting is only modified occasionally.
- while continuous panning over a document is standard (i.e., using scrollbars or the cursor to translate the viewed document left, right, up or down), the ability to zoom and pan continuously in a user-friendly manner is absent from prior art systems.
- a display is the device or devices used to output rendered imagery to the user.
- a frame buffer is used to dynamically represent the contents of at least a portion of the display.
- Display refresh rate is the rate at which the physical display, or portion thereof, is refreshed using the contents of the frame buffer.
- a frame buffer's frame rate is the rate at which the frame buffer is updated.
- the display refresh rate is 60-90 Hz.
- Most digital video, for example, has a frame rate of 24-30 Hz.
- each frame of digital video will actually be displayed at least twice as the display is refreshed.
- Plural frame buffers may be utilized at different frame rates and thus be displayed substantially simultaneously on the same display. This would occur, for example, when two digital videos with different frame rates were being played on the same display, in different windows.
- ZUI zooming user interfaces
- LOD pyramid The complete set of LODs, organized conceptually as a stack of images of decreasing resolution, is termed the LOD pyramid — see Fig. 1.
- the system interpolates between the LODs and displays a resulting image at a desired resolution. While this approach solves the computational issue, it displays a final compromised image that is often blurred and unrealistic, and often involves loss of information due to the fact that it represents interpolation of different LODs. These interpolation errors are especially noticeable when the user stops zooming and has the opportunity to view a still image at a chosen resolution which does not precisely match the resolution of any of the LODs.
- the prior art typically treats vector data in the same way as photographic or image data.
- Vector data, such as blueprints or line drawings, are displayed by processing a set of abstract instructions using a rendering algorithm, which can render lines, curves and other primitive shapes at any desired resolution.
- Text rendered using scalable fonts is an important special case of vector data.
- Image or photographic data (including text rendered using bitmapped fonts) are not so generated, but must be displayed either by interpolation between precomputed LODs or by resampling an original image. We refer to the latter herein as nonvector data.
- a further object of the present invention is to allow the user to zoom arbitrarily far in on vector content while maintaining a crisp, unblurred view of the content and maintaining interactive frame rates.
- a further object of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex vectorial content, while both preserving the overall appearance of the content and maintaining interactive frame rates.
- a further object of the present invention is to diminish the user's perception of transitions between LODs or rendition qualities during interaction.
- a further object of the present invention is to allow the graceful degradation of image quality by blurring when information ordinarily needed to render portions of the image is as yet incomplete.
- a further object of the present invention is to gradually increase image quality by bringing it into sharper focus as more complete information needed to render portions of the image becomes available.
- if the desired resolution is either greater than the resolution of the LOD with the highest available resolution or less than the resolution of the LOD with the lowest resolution, then there will be only a single "surrounding LOD".
- the dynamic interpolation of an image at a desired resolution based on a set of precomputed LODs is termed in the literature mipmapping or trilinear interpolation. The latter term further indicates that bilinear sampling is used to resample the surrounding LODs, followed by linear interpolation between these resampled LODs (hence trilinear). See, e.g., Lance Williams, "Pyramidal Parametrics," Computer Graphics (Proc. SIGGRAPH '83) 17(3): 1-11 (1983).
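Selecting the surrounding LODs and the interpolation weight between them can be sketched as follows. This is an illustrative sketch only: the resolutions are hypothetical, the function name is an assumption, and the weight is taken linearly in the logarithm of resolution (as is common in mipmapping), which the text does not prescribe.

```python
import bisect
import math

def surrounding_lods(lod_resolutions, desired):
    """Return (lower_index, upper_index, weight_of_upper).

    lod_resolutions must be sorted in ascending order. When the desired
    resolution falls outside the pyramid there is a single surrounding LOD,
    reported with both indices equal.
    """
    i = bisect.bisect_left(lod_resolutions, desired)
    if i == 0:
        return 0, 0, 1.0
    if i == len(lod_resolutions):
        last = len(lod_resolutions) - 1
        return last, last, 1.0
    low, high = lod_resolutions[i - 1], lod_resolutions[i]
    weight = math.log(desired / low) / math.log(high / low)
    return i - 1, i, weight

lods = [64, 128, 256, 512]  # hypothetical samples per side, factors of 2
print(surrounding_lods(lods, 128 * math.sqrt(2)))
print(surrounding_lods(lods, 1000))
```

The bilinear resampling of each selected LOD is omitted here; only the choice of LODs and the final linear blend weight are shown.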
- the final image is then displayed by preferably first displaying an intermediate final image.
- the intermediate final image is the first image displayed at the desired resolution before that image is refined as described hereafter.
- the intermediate final image may correspond to the image that would be displayed at the desired resolution using the prior art.
- the transition from the intermediate final image to the final image may be gradual, as explained in more detail below.
- the present invention allows LODs to be spaced in any resolution increments, including irrational increments (i.e. magnification or minification factors between consecutive LODs which cannot be expressed as the ratio of two integers), as explained in more detail below.
- portions of the image at each different LOD are denoted tiles, and such tiles are rendered in an order that minimizes any perceived imperfections to a viewer.
- the displayed visual content is made up of plural LODs (potentially a superset of the surrounding LODs as described above), each of which is displayed in the proper proportion and location in order to cause the display to gradually fade into the final image in a manner that conceals imperfections.
- the present invention involves a hybrid strategy, in which an image is displayed using predefined LODs during rapid zooming and panning, but when the view stabilizes sufficiently, an exact LOD is rendered and displayed.
- the exact LOD is rendered and displayed at the precise resolution chosen by the user, which is normally different from the predefined LODs. Because the human visual system is insensitive to fine detail in the visual content while it is still in motion, this hybrid strategy can produce the illusion of continuous "perfect rendering" with far less computation.
- Figure 1 depicts an LOD pyramid (in this case the base of the pyramid, representing the highest-resolution representation, is a 512x512 sample image, and successive minifications of this image are shown in factors of 2);
- Figure 2 depicts a flow chart for use in an exemplary embodiment of the invention
- Figure 3 is another flow chart that shows how the system displays the final image after zooming
- Figure 4 is the LOD pyramid of Figure 1 with grid lines added showing the subdivision of each LOD into rectangular tiles of equal size in samples;
- Figure 5 is another flow chart, for use in connection with the present invention, and it depicts a process for displaying rendered tiles on a display
- Figure 6 shows a concept termed irrational tiling, explained in more detail herein;
- Figure 7 depicts a composite tile and the tiles that make up the composite tile, as explained more fully below.
- Figure 2 shows a flowchart of a basic technique for implementation of the present invention.
- the flowchart of Figure 2 represents an exemplary embodiment of the invention and would begin executing when an image is displayed at an initial resolution.
- the invention may be used in the client-server model, and the client and server may be on the same or different machines.
- the actual hardware platform and system utilized are not critical to the present invention.
- the flowchart is entered at start block 201 with an initial view of an image at a particular resolution. In this example, the image is taken to be static.
- the image is displayed at block 202.
- a user may navigate that image by moving, for example, a computer mouse.
- the initial view displayed at block 202 will change when the user navigates the image.
- the underlying image may itself be dynamic, such as in the case of motion video; however, for purposes of this example, the image itself is treated as static.
- any image to be displayed may also have textual or other vector data and/or nonvector data such as photographs and other images.
- the present invention, and the entire discussion below, is applicable regardless of whether the image comprises vector or nonvector data, or both.
- the method transfers control to decision point 203 at which navigation input may be detected.
- Decision point 203 may be implemented by a continuous loop in software looking for a particular signal that detects movement, an interrupt system in hardware, or any other desired methodology.
- the particular technique utilized to detect and analyze the navigation request is not critical to the present invention. Regardless of the methodology used, the system can detect the request, thus indicating a desire to navigate the image.
- Such transformations may include, for example, three dimensional translation and rotation, application of an image filter, local stretching, dynamic spatial distortion applied to selected areas of the image, or any other kind of distortion that might reveal more information.
- Another example would be a virtual magnifying glass, that can get moved over the image and which magnifies parts of the image under the virtual magnifying glass.
- the selected LODs may be those two LODs that "surround" the desired resolution; i.e., the resolution of the new view.
- the interpolation in prior systems constantly occurs as the user zooms and is thus often implemented directly in the hardware to achieve speed.
- the combination of detection of movement in decision point 205 and a substantially immediate display of an appropriate interpolated image at block 204 results in the image appearing to zoom continuously as the user navigates. During zooming in or out, since the image is moving, an interpolated image is sufficient to look realistic and clear. Any interpolation error is only minimally detectable by the human visual system, as such errors are disguised by the constantly changing view of the image.
- the system tests whether or not the movement has substantially ceased.
- the methodology ascertains whether or not the user has arrived at the point where he has finished zooming.
- control is transferred to block 206, where an exact image is rendered, after which control returns to block 203.
- the system will eventually display an exact LOD.
- the display is not simply rendered and displayed by an interpolation of two predefined LODs, but may be rendered and displayed by re-rendering vector data using the original algorithm used to render the text or other vector data when the initial view was displayed at block 202.
- Nonvector data may also be resampled for rendering and displayed at the exact required LOD.
- the required re-rendering or resampling may be performed not only at the precise resolution required for display at the desired resolution, but also on a sampling grid corresponding precisely to the correct positions of the display pixels relative to the underlying content, as calculated based on the desired view.
- translation of the image on the display by a fraction of a pixel in the display plane does not change the required resolution, but it does alter the sampling grid, and therefore requires re-rendering or resampling of the exact LOD.
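The hybrid strategy of Fig. 2 can be summarized as a per-frame decision. The sketch below is a simplification under stated assumptions: "movement has substantially ceased" is approximated by a hypothetical speed threshold, and the `"idle"` state (nothing to redraw once the exact image is current) is an addition for illustration rather than a block of the flowchart.

```python
def frame_actions(view_speeds, still_threshold=0.01):
    """Map a sequence of per-frame view speeds to display actions.

    "interpolate":  view is changing, show an interpolation of predefined LODs
    "render_exact": view has settled, render the exact LOD for this view
    "idle":         exact image already shown and the view has not changed
    """
    actions = []
    exact_is_current = False
    for speed in view_speeds:
        if speed > still_threshold:
            actions.append("interpolate")
            exact_is_current = False      # any motion invalidates the exact image
        elif not exact_is_current:
            actions.append("render_exact")
            exact_is_current = True
        else:
            actions.append("idle")
    return actions

actions = frame_actions([1.0, 0.5, 0.0, 0.0, 0.3, 0.0])
print(actions)
```

Even a sub-pixel pan counts as motion here, since it changes the sampling grid and therefore invalidates the previously rendered exact image.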
- the foregoing system of Fig. 2 represents a hybrid approach in which interpolation based upon predefined LODs is utilized while the view is changing (e.g.
- the term render refers to the generation by the computer of a tile at a specific LOD based upon vector or nonvector data. With respect to nonvector data, these may be rerendered at an arbitrary resolution by resampling an original image at higher or lower resolution.
- this interpolated image, which may be temporarily displayed after the navigation ceases, is termed the intermediate final image, or simply an intermediate image.
- This image is generated from an interpolation of the surrounding LODs.
- the intermediate image may be interpolated from more than two discrete LODs, or from two discrete LODs other than the ones that surround the desired resolution.
- block 304 is entered, which causes the image to begin to gradually fade towards an exact rendition of the image, which we term the final image.
- the final image differs from the intermediate image in that the final image may not involve interpolation of any predefined LODs. Instead, the final image, or portions thereof, may comprise newly rendered tiles.
- the newly rendered tiles may result from resampling the original data, and in the case of vector data, the newly rendered tiles may result from rasterization at the desired resolution.
- step 304 is executed so the changeover from the intermediate final image to the final image is done gradually and smoothly. This gradual fading, sometimes called blending, causes the image to come into focus gradually when navigation ceases, producing an effect similar to automatic focusing in cameras or other optical instruments. The illusion of physicality created by this effect is an important aspect of the present invention.
- a first LOD may take a 1 inch by 1 inch area of a viewable object and generate a single 32 by 32 sample tile.
- the information may also be rendered by taking the same 1 inch by 1 inch area and representing it as a tile that is 64 by 64 samples, and therefore at a higher resolution.
- irrational tiling Tiling granularity, which we will write as the variable g, is defined as the ratio of the linear tiling grid size at a higher-resolution LOD to the linear tiling grid size at the next lower-resolution LOD.
- g = 2
- This same value of g has been used in other prior art.
- although LODs may be subdivided into tiles in any fashion, in an exemplary embodiment each LOD is subdivided into a grid of square or rectangular tiles containing a constant number of samples (except, as required, at the edges of the visual content).
- zooming in on any point will therefore produce a quasi-random stream of requests for 1, 2 or 4 tiles, and performance will be on average uniform when zooming in everywhere.
- irrational tiling emerges in connection with panning after a deep zoom. When the user pans the image after having zoomed in deeply, at some point a grid line will be moved onto the display.
- Figure 6(b) illustrates the advantage gained by irrational tiling granularity.
- Figure 6 shows cross-sections through several LODs of the visual content; each bar represents a cross-section of a rectangular tile.
- the curves 601, drawn from top to bottom, represent the bounds of the visible area of the visual content at the relevant LOD during a zooming operation: as the resolution is increased (zooming in to reveal more detail), the area under examination decreases.
- Darker bars (e.g., 602) represent tiles which have already been rendered over the course of the zoom.
- An important aspect of the invention is the order in which the tiles are rendered. More particularly, the various tiles of the various LODs are optimally rendered such that all visible tiles are rendered first. Nonvisible tiles may not be rendered at all. Within the set of visible tiles, rendition proceeds in order of increasing resolution, so that tiles within low-resolution LODs are rendered first.
- tiles are rendered in order of increasing distance from the center of the display, which we refer to as foveated rendering.
- many sorting algorithms such as heapsort, quicksort, or others may be used.
- a lexicographic key may be used for sorting "requests" to render tiles, such that the outer subkey is visibility, the middle subkey is resolution in samples per physical unit, and the inner subkey is distance to the center of the display.
- Other methods for ordering tile rendering requests may also be used.
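The lexicographic ordering of tile rendering requests maps naturally onto a tuple sort key. The following is a minimal sketch; the `TileRequest` record and its field names are hypothetical, and nonvisible requests are simply sorted last here, although they need not be rendered at all.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TileRequest:
    name: str
    visible: bool
    resolution: float        # samples per physical unit
    center_distance: float   # distance from the center of the display

def render_order(requests):
    """Sort by visibility, then increasing resolution, then distance to center."""
    return sorted(
        requests,
        key=lambda r: (not r.visible, r.resolution, r.center_distance),
    )

requests = [
    TileRequest("hi-far", True, 64.0, 5.0),
    TileRequest("lo-near", True, 32.0, 1.0),
    TileRequest("hidden", False, 16.0, 0.0),
    TileRequest("hi-near", True, 64.0, 2.0),
]
ordered = [r.name for r in render_order(requests)]
print(ordered)
```

Python's `sorted` compares tuples element by element, so the outer, middle and inner subkeys are applied in exactly that priority.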
- the actual rendering of the tiles optimally takes place as a parallel process with the navigation and display described herein. When rendering and navigation/display proceed as parallel processes, user responsiveness may remain high even when tiles are slow to render.
- a tile represents vector data, such as alphabetic typography in a stroke based font
- rendering of the tile would involve running the algorithm to rasterize the alphabetic data and possibly transmitting that data to a client from a server.
- the data fed to the rasterization algorithm could be sent to the client, and the client could run the algorithm to rasterize the tile.
- rendering of a tile involving digitally sampled photographic data could involve resampling of that data to generate the tile at the appropriate LOD. For discrete LODs that are prestored, rendering may involve no more than simply transmitting the tile to a client computer for subsequent display.
- the actual display may comprise different mixes of different tiles from different LODs.
- any portion of the display could contain, for example, 20% from LOD 1, 40% from LOD 2, and 40% from LOD 3.
- the algorithm attempts to render tiles from the various LODs in a priority order best suited to supply the rendered tiles for display as they are most needed.
- the actual display of the rendered tiles will be explained in more detail later with reference to Figure 5.
- the algorithm is designed to make the best use of all rendered tiles, using high-resolution tiles in preference to lower-resolution tiles covering the same display area, yet using spatial blending to avoid sharp boundaries between LODs, and temporally graduated blending weights to blend in higher detail if and when it becomes available (i.e. when higher-resolution tiles have been rendered).
- this algorithm and variants thereof can result in more than two LODs being blended together at a given point on the display; it can also result in blending coefficients that vary smoothly over the display area; and it can result in blending coefficients that evolve in time even after the user has stopped navigating.
- a composite tile area or simply a composite tile.
- To define a composite tile we consider all of the LODs stacked on top of each other. Each LOD has its own tile grid. The composite grid is then formed by the projection of all of the grids from all of the LODs onto a single plane. The composite grid is then made up of various composite tiles of different sizes, defined by the boundaries of tiles from all of the different LODs. This is shown conceptually in Fig. 7. Fig. 7 depicts the tiles from three different LODs, 701 through 703, all representing the same image.
- Fig. 7 shows that there would be a single "composite tile" 710.
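The projection of several tile grids onto one composite grid can be illustrated in one dimension, where each grid is just a sorted list of grid-line positions. This is a sketch under simplifying assumptions: the grid positions are hypothetical, coincident lines are assumed to be numerically identical, and the function names are not from the source.

```python
def composite_tiles(grids):
    """Union the grid lines of all LODs; consecutive lines bound composite tiles."""
    edges = sorted({edge for grid in grids for edge in grid})
    return list(zip(edges, edges[1:]))

def containing_tile(grid, interval):
    """Return the tile of one LOD's grid that contains the given interval."""
    start, end = interval
    for left, right in zip(grid, grid[1:]):
        if left <= start and end <= right:
            return (left, right)
    return None

# Three hypothetical LOD grids covering the same content extent [0, 9].
coarse = [0.0, 9.0]
middle = [0.0, 4.5, 9.0]
fine = [0.0, 3.0, 6.0, 9.0]

tiles = composite_tiles([coarse, middle, fine])
print(tiles)
print(containing_tile(middle, tiles[1]), containing_tile(fine, tiles[1]))
```

By construction no grid line of any LOD crosses the interior of a composite tile, so each composite tile lies within exactly one tile per LOD, which is what allows a per-LOD weight to be assigned to it.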
- the frame rate may be typically greater than ten frames per second. Note that, as explained above, this frame rate is not necessarily the display refresh rate.
- Fig. 5 depicts a flow chart of an algorithm for updating the frame buffer as tiles are rendered.
- the arrangement of Fig. 5 is intended to operate on every composite tile in the displayed image each time the frame buffer is updated.
- a frame duration is 1/20 of a second
- each of the composite tiles on the entire screen would preferably be examined and updated during each 1/20 of a second.
- the composite tile may lack the relevant tiles in one or more LODs.
- the process of Fig. 5 attempts to display each composite tile as a weighted average of all the available superimposed tiles within which the composite tile lies. Note that composite tiles are defined in such a way that they fall within exactly one tile at any given LOD; hence the weighted average can be expressed as a relative proportion of each LOD.
- the process attempts to determine the appropriate weights for each LOD within the composite tile, and to vary those weights gradually over space and time to cause the image to gradually fade towards the final images discussed above.
- the composite grid includes plural vertices which are defined to be any intersection or corner of gridlines in the composite grid. These are termed composite grid vertices.
- the current weights at any particular time for each LOD at each vertex are maintained in memory.
- the algorithm for updating vertex weights proceeds as described below.
- Both of these variables are again numbers between 0.0 and 1.0, and are maintained for each vertex in the composite tiling.
- the algorithm walks through each LOD in turn, in order from highest-resolution to lowest, performing the following operations. First 0.0 is assigned to levelOpacityGrid at all vertices. Then, for each rendered tile at that LOD (which may be a subset of the set of tiles at that LOD, if some have not yet been rendered), the algorithm updates the parts of the levelOpacityGrid touching that tile based on the tile's centerOpacity, cornerOpacity and edgeOpacity values: If the vertex is entirely in the interior of the tile, then it gets updated using centerOpacity.
- if the vertex is, e.g., on the tile's left edge, it gets updated with the left edgeOpacity.
- if the vertex is, e.g., on the top right corner, it gets updated with the top right cornerOpacity.
- "Updating" means the following: if the pre-existing levelOpacityGrid value is greater than 0.0, then set the new value to the minimum of the present value, or the value it's being updated with. If the pre-existing value is zero (i.e. this vertex hasn't been touched yet) then just set the levelOpacityGrid value to the value it's being updated with.
- the levelOpacityGrid at each vertex position gets set to the minimum nonzero value with which it gets updated.
- the algorithm then walks through the levelOpacityGrid and sets to 0.0 any vertices that touch a tile which has not yet been rendered, termed a hole. This ensures spatial continuity of blending: wherever a composite tile falls within a hole, at the current LOD, drawing opacity should fade to zero at all vertices abutting that hole.
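The per-LOD update rule and the zeroing of vertices around holes can be sketched on a one-dimensional row of vertices. This is an illustrative simplification: each rendered tile is represented by a hypothetical mapping from the vertices it touches to the opacity it contributes there (standing in for its center, edge and corner values), and each hole by the vertices it touches.

```python
def update_vertex(grid, vertex, value):
    """Apply the update rule: first touch sets the value, later touches take the minimum."""
    current = grid[vertex]
    grid[vertex] = value if current == 0.0 else min(current, value)

def level_opacity_grid(num_vertices, rendered_tiles, holes):
    grid = [0.0] * num_vertices                  # start from 0.0 everywhere
    for contributions in rendered_tiles:
        for vertex, value in contributions.items():
            update_vertex(grid, vertex, value)
    for touched_vertices in holes:               # unrendered tiles
        for vertex in touched_vertices:
            grid[vertex] = 0.0                   # fade to zero next to a hole
    return grid

# Five vertices 0..4; tiles span consecutive vertex pairs, tile (2, 3) is a hole.
rendered_tiles = [
    {0: 1.0, 1: 0.8},
    {1: 0.6, 2: 1.0},
    {3: 1.0, 4: 1.0},
]
holes = [(2, 3)]
grid = level_opacity_grid(5, rendered_tiles, holes)
print(grid)
```

Vertex 1 is shared by two rendered tiles and ends up with the smaller of their two values, while both vertices abutting the hole are forced to 0.0.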
- the algorithm can then relax all levelOpacityGrid values to further improve spatial continuity of LOD blending.
- Every vertex is like a tentpole, where the levelOpacityGrid value at that point is the tentpole's height.
- the algorithm has thus far ensured that at all points bordering on a hole, the tentpoles have zero height; and in the interior of tiles that have been rendered, the tentpoles are set to some (probably) nonzero value.
- all the values inside a rendered tile are set to 1.0.
- the border values are 0.0.
- the relax operation smoothes out the tent, always preserving values of 0.0, but possibly lowering other tentpoles to make the function defined by the tent surface smoother, i.e. limiting its maximum spatial derivative. It is immaterial to the invention which of a variety of methods are used to implement this operation; one approach, for example, is to use selective low-pass filtering, locally replacing every nonzero value with a weighted average of its neighbors while leaving zeroes intact. Other methods will also be apparent to those skilled in the art.
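One possible reading of the relax operation, using the selective low-pass filtering mentioned above, is sketched here on a one-dimensional row of vertices. The filter weights (1/4, 1/2, 1/4), the treatment of the row ends, and the use of `min` so that tentpoles are only ever lowered are all assumptions made for illustration.

```python
def relax(values, passes=1):
    """Smooth nonzero values with a small low-pass filter, leaving zeroes intact."""
    current = list(values)
    for _ in range(passes):
        smoothed = list(current)
        for i, value in enumerate(current):
            if value == 0.0:
                continue                          # zeroes are always preserved
            left = current[i - 1] if i > 0 else value
            right = current[i + 1] if i + 1 < len(current) else value
            average = 0.25 * left + 0.5 * value + 0.25 * right
            smoothed[i] = min(value, average)     # only lower, never raise
        current = smoothed
    return current

relaxed = relax([0.0, 1.0, 1.0, 1.0, 0.0])
print(relaxed)
```

Running more passes lowers the values near zeroes further, which reduces the maximum slope of the "tent" at the cost of less opacity near holes.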
- the algorithm then walks over all composite grid vertices, considering corresponding values of levelOpacityGrid and opacityGrid at each vertex: if levelOpacityGrid is greater than 1.0-opacityGrid, then levelOpacityGrid is set to 1.0- opacityGrid. Then, again for each vertex, corresponding values of levelOpacityGrid are added to opacityGrid. Due to the previous step, this can never bring opacityGrid above 1.0. These steps in the algorithm ensure that as much opacity as possible is contributed by higher-resolution LODs when they are available, allowing lower-resolution LODs to "show through" only where there are holes.
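- The clamp-and-accumulate step can be sketched as below, assuming the levels are processed from highest to lowest resolution and that both grids are dictionaries keyed by composite-grid vertex (an assumption for this example). The text describes two passes (clamp, then add); the sketch folds them into one loop per vertex, which gives the same result.

```python
def accumulate_level(opacity_grid, level_opacity_grid):
    """Clamp one level's opacities to the remaining budget, then add them in."""
    for vertex, level_opacity in level_opacity_grid.items():
        used = opacity_grid.get(vertex, 0.0)
        # A level may contribute at most what is left before reaching 1.0.
        clamped = min(level_opacity, 1.0 - used)
        level_opacity_grid[vertex] = clamped
        opacity_grid[vertex] = used + clamped


opacity_grid = {}
fine = {"a": 1.0, "b": 0.0, "c": 0.25}   # "b" abuts a hole at the fine level
coarse = {"a": 1.0, "b": 1.0, "c": 1.0}
accumulate_level(opacity_grid, fine)     # higher resolution first
accumulate_level(opacity_grid, coarse)   # coarse level fills what remains
```

- After both calls the coarse level contributes nothing at "a", everything at "b", and the remaining 0.75 at "c", so the lower-resolution content shows through only where the finer level is missing or faded.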
- levelOpacityGrid can be multiplied by a scalar overallOpacity variable in the range 0.0 to 1.0 just before drawing; this allows the entire image to be drawn with partial transparency given by the overallOpacity.
- drawing an image-containing polygon, such as a rectangle, with different opacities at each vertex is a standard procedure. It can be accomplished, for example, using industry- standard texture mapping functions using the OpenGL or Direct3D graphics libraries.
- the drawn opacity within the interior of each such polygon is spatially interpolated, resulting in a smooth change in opacity over the polygon.
- tiles maintain not only their current values of centerOpacity, cornerOpacity and edgeOpacity (called the current values), but also a parallel set of values called targetCenterOpacity, targetCornerOpacity and targetEdgeOpacity (called the target values).
- the current values are all set to 0.0 when a tile is first rendered, but the target values are all set to 1.0. Then, after each frame, the current values are adjusted to new values closer to the target values.
- newValue = oldValue*(1-b) + targetValue*b, where b is a rate greater than 0.0 and less than 1.0.
- a value of b close to 0.0 will result in a very slow transition toward the target value, and a value of b close to 1.0 will result in a very rapid transition toward the target value.
- This method of updating opacities results in exponential convergence toward the target, and results in a visually pleasing impression of temporal continuity.
- Other formulae can achieve the same result.
- the present invention relates generally to zooming user interfaces (ZUIs) for computers. More specifically, the invention is a system and method for progressively rendering arbitrarily large or complex visual content in a zooming environment while maintaining good user responsiveness and high frame rates. Although it is necessary in some situations to temporarily degrade the quality of the rendition to meet these goals, the present invention largely masks this degradation by exploiting well-known properties of the human visual system.
- GUIs graphical computer user interfaces
- visual components could be represented and manipulated in such a way that they do not have a fixed spatial scale on the display, but can be zoomed in or out.
- the desirability of zoomable components is obvious in many application domains; to name only a few: viewing maps, browsing through large heterogeneous text layouts such as newspapers, viewing albums of digital photographs, and working with visualizations of large data sets.
- Even when viewing ordinary documents, such as spreadsheets and reports, it is often useful to be able to glance at a document overview, then zoom in on an area of interest.
- zoomable components such as Microsoft® Word ® and other Office ® products (Zoom under the View menu), Adobe ® Photoshop ®, Adobe ® Acrobat ®, QuarkXPress ®, etc.
- these applications allow zooming in and out of documents, but not necessarily zooming in and out of the visual components of the applications themselves. Further, zooming is normally a peripheral aspect of the user's interaction with the software, and the zoom setting is only modified occasionally.
- while continuous panning over a document is standard (i.e., using scrollbars or the cursor to translate the viewed document left, right, up or down), the ability to zoom continuously is almost invariably absent.
- any kind of visual content could be zoomed, and zooming would be as much a part of the user's experience as panning.
- Ideas along these lines made appearances as futuristic computer user interfaces in many movies even as early as the 1960s 1 ; recent movies continue the trend 2 .
- a number of continuously zooming interfaces have been conceived and/or developed, from the 1970s through the present. 3 In 1991, some of these ideas were formalized in U.S. Patent 5,341,466 by Kenneth Perlin and Jacob Schwartz at New York University ("Fractal Computer User Centerface with Zooming Capability").
- the prototype zooming user interface developed by Perlin and co-workers, Pad, and its successor, Pad++, have
- the present invention embodies a novel idea on which a newly developed zooming user interface framework (hereafter referred to by its working name, Noss) is based.
- Noss is more powerful, more responsive, more visually compelling and of more general utility than its predecessors due to a number of innovations in its software architecture.
- This patent is specifically about Noss's approach to object tiling, level-of-detail blending, and render queueing.
- a multiresolution visual object is normally rendered from a discrete set of sampled images at different resolutions or levels of detail (an image pyramid).
- the present invention involves both strategies for prioritizing the (potentially slow) rendition of the parts of the image pyramid relevant to the current display, and strategies for presenting the user with a smooth, continuous perception of the rendered content based on partial information, i.e. only the currently available subset of the image pyramid.
- these strategies make near-optimal use of the available computing power or bandwidth, while masking, to the extent possible, any image degradation resulting from incomplete image pyramids. Spatial and temporal blending are exploited to avoid discontinuities or sudden changes in image sharpness.
- An objective of the present invention is to allow sampled (i.e. "pixellated") visual content to be rendered in a zooming user interface without degradation in ultimate image quality relative to conventional trilinear interpolation.
- a further objective of the present invention is to allow arbitrarily large or complex visual content to be viewed in a zooming user interface.
- a further objective of the present invention is to enable near-immediate viewing of arbitrarily complex visual content, even if this content is ultimately represented using a very large amount of data, and even if these data are stored at a remote location and shared over a low-bandwidth network.
- a further objective of the present invention is to allow the user to zoom arbitrarily far in on visual content while maintaining interactive frame rates.
- a further objective of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex visual content, in the process both preserving the overall appearance of the content and maintaining interactive frame rates.
- a further objective of the present invention is to minimize the user's perception of transitions between levels of detail or rendition qualities during interaction.
- a further objective of the present invention is to allow the graceful degradation of image quality by continuous blurring when detailed visual content is as yet unavailable, either because the information needed to render it is unavailable, or because rendition is still in progress.
- a further objective of the present invention is to gracefully increase image quality by gradual sharpening when renditions of certain parts of the visual content first become available.
- zooming user interfaces are a generalization of the usual concepts underlying visual computing, allowing a number of limitations inherent in the classical user/computer/document interaction model to be overcome.
- One such limitation is on the size of a document that can be "opened” from a computer application, as traditionally the entirety of such a document must be “loaded” before viewing or editing can begin.
- RAM random access memory
- this limitation is felt, because all of the document information must be transferred to short-term memory from some repository (e.g. from a hard disk, or across a network) during opening; limited bandwidth can thus make the delay between issuing an "open" command and being able to begin viewing or editing unacceptably long.
- Still digital images both provide an excellent example of this problem, and an illustration of how the computer science community has moved beyond the standard model for visual computing in overcoming the problem.
- Table 1 shows download times at different bandwidths for typical compressed sizes of a variety of different image types, from the smallest useful images (thumbnails, which are sometimes used as icons) to the largest in common use today. Shaded boxes indicate image sizes for which interactive browsing is difficult or impossible at a particular connection speed.
- the image is first resized to a hierarchy of resolution scales, usually in factors of two; for example, a 512x512 pixel image is resized to be 256x256 pixels, 128x128, 64x64, 32x32, 16x16, 8x8, 4x4, 2x2, and 1x1.
- the fine details are only captured at the higher resolutions, while the broad strokes are captured — using a much smaller amount of information — at the low resolutions. This is why the differently-sized images are often called levels of detail, or LODs for short.
- LODs levels of detail
- a low-resolution image serves as a "predictor" for the next higher resolution.
- This allows the entire image hierarchy to be encoded very efficiently; more efficiently, in fact, than would usually be possible with a non-hierarchical representation of the high-resolution image alone. If one imagines that the sequence of multiresolution versions of the image is stored in order of increasing size in the repository, then a natural consequence is that as the image is transferred across the data link to the cache, the user can obtain a low-resolution overview of the entire image very rapidly; finer and finer details will then "fill in" as the transmission progresses. This is known as incremental or progressive transmission.
- an image browsing system can be made that is not only capable of viewing images of arbitrarily large size, but is also capable of navigating (i.e. zooming and panning) through such images efficiently at any level of detail.
- Previous models of document access are by nature serial, meaning that the entirety of an information object is transmitted in linear order.
- This model is random-access, meaning that only selected parts of the information object are requested, and these requests may be made in any order and over an extended period of time, i.e. over the course of a viewing session.
- the computer and the repository now engage in an extended dialogue, paralleling the user's "dialogue" with the document as viewed on the display.
- each level of detail is the basic unit of transmission.
- the size in pixels of each tile can be kept at or below a constant size, so that each increasing level of detail contains about four times as many tiles as the previous level of detail. Small tiles may occur at the edges of the image, as its dimensions may not be an exact multiple of the nominal tile size; also, at the lowest levels of detail, the entire image will be smaller than a single nominal tile.
- the resulting tiled image pyramid is shown in Figure 2. Note that the "tip" of the pyramid, where the downscaled image is smaller than a single tile, looks like the untiled image pyramid of Figure 1.
- the JPEG2000 image format includes all of the features just described for representing tiled, multiresolution and random-access images.
- This includes (but is not limited to) large texts, maps or other vector graphics, spreadsheets, video, and mixed documents such as web pages.
- Our discussion thus far has also implicitly considered a viewing-only application, i.e. one in which only the actions or methods corresponding to opening and drawing need be defined.
- Clearly other methods may be desirable, such as the editing commands implemented by paint programs for static images, the editing commands implemented by word processors for texts, etc.
- is no longer possible if we have zoomed so far in that a single letter fills the entire screen. Hence a zooming user interface may also restrict the action of certain methods to their relevant levels of detail.
- a visual document is not represented internally as an image, but as more abstract data —such as text, spreadsheet entries, or vector graphics — it is necessary to generalize the tiling concept introduced in the previous section.
- the process of rendering a tile, once obtained, is trivial, since the information (once decompressed) is precisely the pixel-by-pixel contents of the tile.
- the speed bottleneck is normally the transfer of compressed data to the computer (e.g. downloading).
- the speed bottleneck is in the rendition of tiles; the information used to make the rendition may already be stored locally, or may be very compact, so that downloading no longer causes delay.
- tile rendition the understanding that this may be a slow process. Whether it is slow because the required data are substantial and must be downloaded over a slow connection or because the rendition process is itself computationally intensive is irrelevant.
- a complete zooming user interface combines these ideas in such a way that the user is able to view a large and possibly dynamic composite document, whose sub-documents are usually spatially non-overlapping. These sub-documents may in turn contain (usually non-overlapping) sub-sub-documents, and so on.
- documents form a tree, a structure in which each document has pointers to a collection of sub-documents, or children, each of which is contained within the spatial boundary of the parent document.
- a node borrowing from programming terminology for trees.
- drawing methods are defined for all nodes at all levels of detail, other methods corresponding to application-specific functionality may be defined only for certain nodes, and their action may be restricted only to certain levels of detail.
- some nodes may be static images which can be edited using painting-like commands, while other nodes may be editable text, while other nodes may be Web pages designed for viewing and clicking. All of these can coexist within a common large spatial environment — a "supernode” — which can be navigated by zooming and panning.
- There are a number of immediate consequences for a well-implemented zooming user interface, including:
- - It is able to browse very large documents without downloading them in their entirety from the repository; thus even documents larger than the available short-term memory, or whose size would otherwise be prohibitive, can be viewed without limitation.
- - Content is only downloaded as needed during navigation, resulting in optimally efficient use of the available bandwidth.
- - Zooming and panning are spatially intuitive operations, allowing large amounts of information to be organized in an easily understood way.
- - Since "screen space" is essentially unlimited, it is not necessary to minimize windows, use multiple desktops, or hide windows behind each other to work on multiple documents or views at once.
- documents can be arranged as desired, and the user can zoom out for an overview of all of them, or in on particular ones. This does not preclude the possibility of rearranging the positions (or even scales) of such documents to allow any combination of them to be visible at a useful scale on the screen at the same time. Neither does it necessarily preclude combining zooming with more traditional approaches.
- - - Because zooming is an intrinsic aspect of navigation, content of any kind can be viewed at an appropriate spatial scale.
- - - High-resolution displays no longer imply shrinking text and images to small (sometimes illegible) sizes; depending on the level of zooming, they either allow more content to be viewed at once, or they allow content to be viewed at normal size and higher fidelity.
- the client's first priority will be to fill in this "resolution hole". If more than one level of detail is missing in the hole, then requests for all levels of detail with f < 1, plus the next higher level of detail (to allow LOD blending; see #5), are queued in increasing order. At first glance, one might suppose that this introduces unnecessary overhead, because only the finest of these levels of detail is strictly required to render the current view; the coarser levels of detail are redundant, in that they define a lower-resolution image on the display. However, these coarser levels cover a larger area; in general, an area considerably larger than the display.
- the coarsest level of detail for any node in fact includes only a single tile by construction, so a client rendering any view of a node will invariably queue this "outermost" tile first.
- by robustness we mean that the client is never "at a loss" regarding what to display in response to a user's panning and zooming, even if there is a large backlog of tile requests waiting to be filled.
- the client simply displays the best (i.e. highest resolution) image available for every region on the display. At worst, this will be the outermost tile, which is the first tile ever requested in connection with the node.
- tile requests are queued by increasing distance to the center of the screen, as shown in Figure 3.
- This technology is inspired by the human eye, which has a central region (the fovea) specialized for high resolution. Because zooming is usually associated with interest in the central region of the display, foveated tile request queuing usually reflects the user's implicit prioritization for visual information during inward zooms. Furthermore, because the user's eye generally spends more time looking at regions near the center of the display than the edge, residual blurriness at the display edge is less noticeable than near the center. The transient, relative increase in sharpness near the center of the display produced by zooming in using foveal tile request order also mirrors the natural consequences of zooming out (see Figure 4).
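- Foveated request ordering amounts to sorting pending tiles by distance from the screen center. The sketch below assumes each tile is given as screen-space bounds `(x0, y0, x1, y1)` and measures distance from the tile's center; both choices are assumptions for illustration, since the text only specifies "distance to the center of the screen".

```python
import math


def foveated_order(tiles, screen_center):
    """Return tile bounds sorted by increasing distance to the screen center."""
    cx, cy = screen_center

    def distance(tile):
        x0, y0, x1, y1 = tile
        # Assumption: a tile's position is represented by its center point.
        return math.hypot((x0 + x1) / 2.0 - cx, (y0 + y1) / 2.0 - cy)

    return sorted(tiles, key=distance)


pending = [(0, 0, 100, 100), (400, 300, 500, 400), (300, 300, 400, 400)]
queue = foveated_order(pending, screen_center=(450, 350))
```

- Here the tile centered on the screen is requested first and the far corner tile last.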
- the opacity of the new tile is a linear function of time since the tile became available, so that halfway through the fixed blend-in interval the new tile is 50% opaque), exponential, or follow any other interpolating function.
- every small constant interval of time corresponds to a constant percent change in the opacity; for example, the new tile may become 20% more
- FIG. 5 shows our simplest reference implementation for how each tile can be decomposed into rectangles and triangles, called tile shards, such that opacity changes continuously over each tile shard.
- Tile X, bounded by the square aceg, has neighboring tiles L, R, T and B on the left, right, top and bottom, each sharing an edge. It also has neighbors TL, TR, BL and BR, each sharing a single corner. Assume that tile X is present. Its "inner square", iiii, is then fully opaque.
- Part (b) is a rectangle in which the opacities of two opposing edges are different; then the opacity over the interior is simply a linear interpolation based on the shortest distance of each interior point from the two edges.
- Part (c) shows a bilinear method for interpolating opacity over a triangle, when the opacities of all three corners abc may be different.
- every interior point p subdivides the triangle into three sub-triangles as shown, with areas A, B and C.
- the opacity at p is then simply a weighted sum of the opacities at the corners, where the weights are the fractional areas of the three sub-triangles (i.e.
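- The area-weighted interpolation over a triangle can be sketched as below. The pairing of each corner with the sub-triangle opposite it is the standard barycentric convention and is an assumption here, since the sentence defining the weights is cut off in the text; the function names are likewise illustrative.

```python
def triangle_area(p, q, r):
    """Unsigned area of the triangle pqr from the 2D cross product."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def interpolate_opacity(p, a, b, c, opacity_a, opacity_b, opacity_c):
    """Opacity at interior point p as an area-weighted sum of corner opacities."""
    total = triangle_area(a, b, c)
    # Each corner is weighted by the fractional area of the sub-triangle
    # opposite it, so the weight is 1 at that corner and 0 on the far edge.
    weight_a = triangle_area(p, b, c) / total
    weight_b = triangle_area(p, a, c) / total
    weight_c = triangle_area(p, a, b) / total
    return weight_a * opacity_a + weight_b * opacity_b + weight_c * opacity_c


a, b, c = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
at_corner = interpolate_opacity(a, a, b, c, 1.0, 0.0, 0.5)
at_midpoint_ab = interpolate_opacity((2.0, 0.0), a, b, c, 1.0, 0.0, 0.5)
```

- At a corner the result equals that corner's opacity, and at the midpoint of edge ab it is the average of the opacities at a and b, so opacity varies continuously across shard boundaries.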
- this strategy causes the relative level of detail visible to the user to be a continuous function, both over the display area and in time. Both spatial seams and temporal discontinuities are thereby avoided, presenting the user with a visual experience reminiscent of an optical instrument bringing a scene continuously into focus. For navigating large documents, the speed with which the scene comes into focus is a function of the bandwidth of the connection to the repository, or the speed of tile rendition, whichever is slower. Finally, in combination with the foveated prioritization of innovation #2, the continuous level of detail is biased in such a way that the central area of the display is brought into focus first.
- 5. Generalized linear-mipmap-linear LOD blending.
- each tile shard has an opacity as drawn, which has been spatially averaged with neighboring tile shards at the same level of detail for spatial smoothness, and temporally averaged for smoothness over time.
- the target opacity is 100% if the level of detail undersamples the display, i.e. f < 1 (see #1).
- the target opacity is decreased linearly (or using any other monotonic function) such that it goes to zero if the oversampling is g-fold.
- this causes continuous blending over a zoom operation, ensuring that the perceived level of detail never changes suddenly.
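- A target-opacity function of this kind can be sketched as follows. The parameter name `oversampling` (the linear ratio of tile samples to display pixels), the default granularity g = 2, and the choice to ramp linearly in that ratio rather than in its logarithm are all assumptions for illustration; the text allows any monotonic function.

```python
def target_opacity(oversampling, g=2.0):
    """Target opacity of a level of detail given its sampling ratio."""
    if g <= 1.0:
        raise ValueError("granularity g must be greater than 1")
    if oversampling <= 1.0:
        return 1.0   # level undersamples (or exactly matches) the display
    if oversampling >= g:
        return 0.0   # g-fold oversampled: the level contributes nothing
    # Linear ramp from 1.0 at a ratio of 1 down to 0.0 at a ratio of g.
    return (g - oversampling) / (g - 1.0)


ramp = [target_opacity(s) for s in (0.5, 1.0, 1.5, 2.0, 3.0)]
```

- Because the function is continuous in the sampling ratio, the finest level fades in gradually as the user zooms in, rather than appearing abruptly.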
- the number of blended levels of detail in this scheme can be one, two, or more. A number larger than two is transient, and caused by tiles at more than one level of detail not having been fully blended in temporally yet.
- a single level is also usually transient, in that it normally occurs when a lower-than-ideal LOD is "standing in" at 100% opacity for higher LODs which have yet to be downloaded or constructed and blended in.
- the simplest reference implementation for rendering the set of tile shards for a node is to use the so-called "painter's algorithm": all tile shards are rendered in back-to-front order, that is, from coarsest (lowest LOD) to finest (highest LOD which oversamples the display less than g-fold).
- the target opacities of all but the highest LOD are 100%, though they may transiently be rendered at lower opacity if their temporal blending is incomplete.
- the highest LOD has variable opacity, depending on how much it oversamples the display, as discussed above.
- this reference implementation is not optimal, in that it may render shards which are then fully obscured by subsequently rendered shards. More optimal implementations are possible through the use of data structures and algorithms analogous to those used for hidden surface removal in 3D graphics.
- 6. Motion anticipation. During rapid zooming or panning, it is especially difficult for tile requests to keep up with demand. Yet during these rapid navigation patterns, the zooming or panning motion tends to be locally well-predicted by linear extrapolation (i.e. it is difficult to make sudden reversals or changes in direction).
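- Linear extrapolation of the view can be sketched as below. Representing the view as a tuple `(x, y, zoom)` and extrapolating each component independently are assumptions made for this example; the text states only that the motion is locally well-predicted by linear extrapolation.

```python
def extrapolate_view(previous, current, dt_previous, dt_ahead):
    """Predict the view dt_ahead into the future from the last two samples."""
    if dt_previous <= 0:
        raise ValueError("dt_previous must be positive")
    scale = dt_ahead / dt_previous
    # Continue each component along its most recent rate of change.
    return tuple(c + (c - p) * scale for p, c in zip(previous, current))


predicted = extrapolate_view(
    previous=(0.0, 0.0, 1.0),   # (x, y, zoom) one frame ago
    current=(10.0, 5.0, 2.0),   # (x, y, zoom) now
    dt_previous=1.0,
    dt_ahead=2.0,
)
```

- Tile requests could then be issued for the predicted view as well as the current one, so that content is more likely to be available when the user arrives there.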
- the present invention relates generally to multiresolution imagery. More specifically, the invention is a system and method for efficiently blending together visual representations of content at different resolutions or levels of detail in real time. The method ensures perceptual continuity even in highly dynamic contexts, in which the data being visualized may be changing, and only partial data may be available at any given time.
- the invention has applications in a number of fields, including (but not limited to) zooming user interfaces (ZUIs) for computers.
- ZUIs zooming user interfaces
- the invention applies in situations in which visual data can be obtained "on the fly” at different levels of detail, for example, from a camera with machine-controllable pan and zoom.
- the present invention is a general approach to the dynamic display of such multiresolution visual data on one or more 2D displays (such as CRTs or LCD screens).
- the wavelet decomposition of a large digital image e.g. as used in the JPEG2000 image format.
- This decomposition takes as its starting point the original pixel data, normally an array of samples on a regular rectangular grid. Each sample usually represents a color or luminance measured at a point in space corresponding to its grid coordinates. In some applications the grid may be very large, e.g.
- the image is first resized to a
- image is resized to be 256x256 pixels, 128x128, 64x64, 32x32, 16x16, 8x8, 4x4, 2x2, and
- granularity may change at different scales, but here, for example and without limitation,
- each level of detail into a grid, such that a grid square, or tile, is the basic unit of transmission.
- the size in pixels of each tile can be kept at or below a constant size, so that each increasing level of detail contains about four times as many tiles as the previous level of detail. Small tiles may occur at the edges of the image, as its dimensions may not be an exact multiple of the nominal tile size; also, at the lowest levels of detail, the entire image will be smaller than a single nominal tile.
- the 512x512 pixel image considered earlier has 8x8 tiles at its highest level of detail, 4x4 at the 256x256 level, 2x2 at the 128x128 level, and a single tile at the remaining levels of detail.
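- The tile counts in this example can be reproduced with a short calculation. The nominal tile size of 64 pixels is inferred from the 8x8 grid at 512x512 and is an assumption, as is rounding odd dimensions up when halving.

```python
def tiles_per_level(width, height, tile_size=64):
    """List ((width, height), (columns, rows)) for every level of the pyramid."""
    levels = []
    while True:
        columns = -(-width // tile_size)   # ceiling division
        rows = -(-height // tile_size)
        levels.append(((width, height), (columns, rows)))
        if width == 1 and height == 1:
            break
        # Halve each dimension for the next coarser level, rounding up.
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
    return levels


pyramid = tiles_per_level(512, 512)
grids = [grid for _, grid in pyramid]
```

- For the 512x512 image this gives ten levels, with 8x8, 4x4 and 2x2 tiles at the three finest levels and a single tile at every coarser level.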
- the JPEG2000 image format includes the features just described for representing tiled, multiresolution and random-access images. If a detail of a large, tiled JPEG2000 image is being viewed interactively by a client on a 2D display of limited size and resolution, then some particular set of adjacent tiles, at a certain level of detail, are needed to produce an accurate rendition. In a dynamic context, however, these may not all be available.
- Tiles at coarser levels of detail often will be available, however, particularly if the user began with a broad overview of the image. Since tiles at coarser levels of detail span a much wider area spatially, it is likely that the entire area of interest is covered by some combination of available tiles. This implies that the image resolution available will not be constant over the display area.
- the edge regions of tiles reserved for blending are referred to as blending flaps.
- the simple reference implementation for displaying a finished composite image is a "painter's algorithm": all relevant tiles (that is, tiles overlapping the display area) in the coarsest level of detail are drawn first, followed by all relevant tiles in progressively finer levels of detail. At each level of detail, blending is applied at the edges of incomplete areas as described. The result, as desired, is that coarser levels of detail "show through" only in places where they are not obscured by finer levels of detail.
- although this simple algorithm works, it has several drawbacks: first, it is wasteful of processor time, as tiles are drawn even when they will ultimately be partly or even completely obscured.
- the painter's algorithm relies precisely on the effect of one "layer of paint” (i.e. level of detail) fully obscuring the one underneath; it is not known in advance where a level of detail will be obscured, and where not.
- the invention resolves these issues, while preserving all the advantages of the painter's algorithm.
- One of these advantages is the ability to deal with any kind of LOD tiling, including non-rectangular or irregular tilings, as well as irrational grid tilings, for which I am filing a separate provisional patent application.
- Tilings generally consist of a subdivision, or tessellation, of the area containing the visual content into polygons.
- the areas of tiles at lower levels of detail be larger than the areas of tiles at higher levels of detail; the multiplicative factor by which their sizes differ is the granularity g, which we will assume (but without limitation) to be a constant.
- an irrational but rectangular tiling grid will be used to describe the improved algorithm. Generalizations to other tiling schemes should be evident to anyone skilled in the art.
- the improved algorithm consists of four stages. In the first stage, a composite grid is constructed in the image's reference frame from the superposition of the visible parts of all of the tile grids in all of the levels of detail to be drawn.
- there be n grid lines parallel to the x-axis and m grid lines parallel to the y-axis.
- n * m table with entries corresponding to the squares of the grid.
- Each grid entry has two fields: an opacity, which is initialized to zero, and a list of references to specific tiles, which is initially empty.
- the second stage is to walk through the tiles, sorted by decreasing level of detail (opposite to the naïve implementation).
- Each tile covers an integral number of composite grid squares. For each of these squares, we check to see if its table entry has an opacity less than 100%, and if so, we add the current tile to its list and increase the opacity accordingly.
- the per-tile opacity used in this step is stored in the tile data structure.
- the composite grid will contain entries corresponding to the correct pieces of tiles to draw in each grid square, along with the opacities with which to draw these "tile shards". Normally these opacities will sum to one. Low-resolution tiles which are entirely obscured will not be referenced anywhere in this table, while partly obscured tiles will be referenced only in tile shards where they are partly visible.
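- The first two stages can be sketched as below for axis-aligned rectangular tiles. The `Tile` and `Cell` classes, their field names, and the use of sorted coordinate lists with `bisect` to locate cells are assumptions made for this illustration, not structures named in the text.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Tile:
    lod: int              # higher value = finer level of detail
    x0: float
    y0: float
    x1: float
    y1: float
    opacity: float = 1.0  # per-tile opacity stored in the tile


@dataclass
class Cell:
    opacity: float = 0.0                        # initialized to zero
    shards: list = field(default_factory=list)  # (tile, drawn opacity) pairs


def build_composite(tiles):
    # Stage 1: superpose every tile edge into one composite grid.
    xs = sorted({x for t in tiles for x in (t.x0, t.x1)})
    ys = sorted({y for t in tiles for y in (t.y0, t.y1)})
    cells = [[Cell() for _ in range(len(xs) - 1)] for _ in range(len(ys) - 1)]
    # Stage 2: visit tiles from finest to coarsest level of detail.
    for tile in sorted(tiles, key=lambda t: t.lod, reverse=True):
        c0, c1 = bisect_left(xs, tile.x0), bisect_left(xs, tile.x1)
        r0, r1 = bisect_left(ys, tile.y0), bisect_left(ys, tile.y1)
        for r in range(r0, r1):
            for c in range(c0, c1):
                cell = cells[r][c]
                if cell.opacity < 1.0:
                    share = min(tile.opacity, 1.0 - cell.opacity)
                    cell.shards.append((tile, share))
                    cell.opacity += share
    return xs, ys, cells


coarse = Tile(lod=0, x0=0, y0=0, x1=2, y1=2)
fine_a = Tile(lod=1, x0=0, y0=0, x1=1, y1=1)
fine_b = Tile(lod=1, x0=1, y0=0, x1=2, y1=1, opacity=0.25)  # still fading in
xs, ys, cells = build_composite([coarse, fine_a, fine_b])
```

- In this example the coarse tile is not referenced at all in the cell fully covered by `fine_a`, contributes the remaining 0.75 under the partially faded `fine_b`, and is drawn at full opacity where no finer tile exists.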
- the third stage of the algorithm is a traversal of the composite grid in which tile shard opacities at the composite grid vertices are adjusted by averaging with neighboring vertices at the same level of detail, followed by readjustment of the vertex opacities to preserve the summed opacity at each vertex (normally 100%).
- This implements a refined version of the spatial smoothing of scale described in a separate provisional patent application. The refinement comes from the fact that the composite grid is in general denser than the 3x3 grid per tile defined in innovation #4, especially for low-resolution tiles.
- the composite gridding will be at least as fine as necessary.
- This allows the averaging technique to achieve greater smoothness in apparent level of detail, in effect by creating smoother blending flaps consisting of a larger number of tile shards.
- the composite grid is again traversed, and the tile shards are actually drawn.
- although this algorithm involves multiple passes over the data and a certain amount of bookkeeping, it results in far better performance than the naïve algorithm, because much less drawing must take place in the end; every tile shard rendered is visible to the user, though sometimes at low opacity. Some tiles may not be drawn at all.
- naïve algorithm which draws every tile intersecting with the displayed area in its entirety.
- An additional advantage of this algorithm is that it allows partially transparent nodes to be drawn, simply by changing the total opacity target from 100% to some lower value. This is not possible with the naïve algorithm, because every level of detail except the most detailed must be drawn at full opacity in order to completely "paint over" any underlying, still lower resolution tiles.
- the composite grid can be constructed in the usual manner; it may be larger than the grid would have been for the unrotated case, as larger coordinate ranges are visible along a diagonal.
- Another exemplary optimization is that the total opacity rendering left to do, expressed in terms of (area) x (remaining opacity), can be kept track of, so that the algorithm can quit early if everything has already been drawn; then low levels of detail need not be "visited” at all if they are not needed.
- the algorithm can be generalized to arbitrary polygonal tiling patterns by using a constrained Delaunay triangulation instead of a grid to store vertex opacities and tile shard identifiers.
- This data structure efficiently creates a triangulation whose edges contain every edge in all of the original LOD grids; accessing a particular triangle or vertex is an efficient operation, which can take place in time of order n*log(n) (where n is the number of vertices or triangles added).
- the resulting triangles are moreover the basic primitive used for graphics rendering on most graphics platforms.
- FIGURE 1 A first figure.
- the present invention is directed to methods and apparatus for the application of image navigation techniques in advancing commerce, for example, by way of providing new environments for advertising and purchasing products and/or services.
- Mapping and geospatial applications are a booming industry. They have been attracting rapidly increasing investment from businesses in many different markets, from candidates like Federal Express, clothing stores and fast food chains. In the past several years, mapping has also become one of the very few software applications on the web that generate significant interest (so-called "killer apps"), alongside search engines, web-based email, and matchmaking.
- Although mapping should in principle be highly visual, at the moment its utility for end users lies almost entirely in generating driving directions.
- the map images which invariably accompany the driving directions are usually poorly rendered, convey little information, and cannot be navigated conveniently, making them little more than window dressing. Clicking on a pan or zoom control causes a long delay, during which the web browser becomes unresponsive, followed by the appearance of a new map image bearing little visual relationship to the previous image.
- Although computers should be able to navigate digital maps more effectively than we navigate paper atlases, in practice visual navigation of maps by computer is still inferior.
- the present invention is intended to be employed in combination with a novel technology permitting continuous and rapid visual navigation of a map (or any other image), even over a low bandwidth connection.
- This technology relates to new techniques for rendering maps continuously in a panning and zooming environment. It is an application of fractal geometry to line and point rendering, allowing networks of roads (1D curves) and dots marking locations (0D points) to be drawn at all scales, producing the illusion of continuous physical zooming, while still keeping the "visual density" of the map bounded.
- Related techniques apply to text labels and iconic content. This new approach to rendition avoids such effects as the sudden appearance or disappearance of small roads during a zoom, an adverse effect typical of digital map drawing. The details of this navigation technology may be found in U.S.
- GIS geographical information service
- The capabilities of the new navigation techniques of the present invention are described in detail in the aforementioned U.S. patent application.
- The most relevant aspects of the base technology are: - smooth zooming and panning through a 2D world with perceptual continuity and advanced bandwidth management; - an infinite-precision coordinate system, allowing visual content to be nested without limit; - the ability to nest content stored on many different servers, so that spatial containment is equivalent to a hyperlink.
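The source does not say how the infinite-precision coordinate system is realized. As one hedged illustration of why exact arithmetic permits unlimited nesting, the sketch below uses Python's `fractions.Fraction` to place each nested frame inside its parent; the `Frame` class, its fields, and `to_root` are hypothetical names, and only one axis is shown (a 2D version would apply the same mapping per axis).

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Sequence


@dataclass(frozen=True)
class Frame:
    """Placement of a nested piece of content inside its parent (one axis)."""
    origin: Fraction  # where the child's 0 lies in parent coordinates
    scale: Fraction   # length of one child unit in parent units

    def to_parent(self, x: Fraction) -> Fraction:
        return self.origin + self.scale * x


def to_root(frames: Sequence[Frame], x: Fraction) -> Fraction:
    """Map a coordinate in the innermost frame to root coordinates.

    `frames` is ordered from outermost to innermost.
    """
    for frame in reversed(frames):
        x = frame.to_parent(x)
    return x


# One hundred levels of nesting, each child 1/1000 the size of its parent.
frames = [Frame(Fraction(1, 2), Fraction(1, 1000))] * 100
left = to_root(frames, Fraction(0))
right = to_root(frames, Fraction(1))
```

The two endpoints differ by exactly 1000 to the power of -100, a gap that fixed-width floating point cannot resolve next to coordinates of ordinary magnitude but that rational arithmetic represents exactly.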
- a map consists of many layers of information; ultimately, the Voss map application will allow the user to turn most of these layers on and off, making the map highly customizable.
- Layers include: 1. roads; 2. waterways; 3. administrative boundaries; 4. aerial photography-based orthoimagery (aerial photography which has been digitally "unwarped" such that it tiles a map perfectly); 5. topography; 6. public infrastructure locations, e.g. schools, churches, public telephones, restrooms;
- the most salient layers from the typical user's point of view are 1-4 and 7.
- The advertising/user content layers 10-11, which are the particular focus of this patent application, are also of significant interest.
- Many of the map layers, including 1-7, are already available, at high quality and negligible cost, from the U.S. Federal Government. Value-added layers like 8-9 (and others) can be made available at any time during development or even after deployment.
- GIS geographic information service
- national Yellow Pages/White Pages data may also be valuable in implementing the present invention. This information may also be licensed. National Yellow Pages/White Pages data may be used in combination with geocoding to allow geographical user searches for businesses, or filtering (e.g. "highlight all restaurants in Manhattan"). Perhaps most importantly, directory listings combined with geocoding will greatly simplify associating business and personal users with geographic locations, allowing "real estate" to be rented or assigned via an online transaction and avoiding the need for a large sales force.
- The "neartime" data are updated at least every 90 days. Combined with 90-day caching of entries already obtained on our end, this is a very economical way to obtain high-quality national listings.
- "Realtime" data, updated nightly, are also available, but are more expensive ($0.20/hit).
- the realtime data are identical to those used by 411 operators.
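The 90-day caching of directory entries mentioned above could be sketched as a simple time-bounded cache. Everything named here (`ListingCache`, the `fetch` callback standing in for a paid directory lookup, the `paid_lookups` counter) is a hypothetical illustration rather than part of the described system.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, Tuple


class ListingCache:
    """Cache directory listings so each entry is fetched at most once per
    `max_age` (90 days by default, matching the neartime update interval)."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        max_age: timedelta = timedelta(days=90),
    ) -> None:
        self._fetch = fetch  # stand-in for a paid lookup against the provider
        self._max_age = max_age
        self._entries: Dict[str, Tuple[datetime, Any]] = {}
        self.paid_lookups = 0

    def get(self, key: str, now: datetime) -> Any:
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # still fresh: no charge incurred
        value = self._fetch(key)
        self.paid_lookups += 1
        self._entries[key] = (now, value)
        return value
```

Passing `now` explicitly, instead of reading the clock inside `get`, keeps the sketch deterministic and easy to test.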
- Although the Voss mapping application both requires downloadable client software and generates revenue through advertising, it will not suffer the disadvantages of classic advertising-based business models. Even before any substantial commercial space has been "rented", the present invention will provide a useful and visually compelling way of viewing maps and searching for addresses, that is, similar functionality to that of existing mapping applications, but with a greatly improved visual interface. Furthermore, the approach of the present invention provides limited but valuable service to non-commercial users free of charge to attract a user base. The limited service consists of hosting a small amount (5-15 MB) of server space per user, at the user's geographical location, typically a house.
- the client software may include simple authoring capabilities, allowing users to drag and drop images and text into their "physical address", which can then be viewed by any other authorized user with the client software. (Password protection may be available.) Because the zooming user interface approach is of obvious benefit for navigating digital photo collections — especially over limited bandwidth — the photo album sharing potential alone may attract substantial numbers of users. Additional server space may be available for a modest yearly fee. This very horizontal market is likely to be a major source of revenue.
- The sources of revenue may include: 1. Commercial "rental" of space on the map corresponding to a physical address; 2. Fees for "plus services" (defined below) geared toward commercial users; 3. Fees for "plus services" geared toward non-commercial users; 4. Professional zoomable content authoring software; 5. Licensing or partnerships with PDA, cell phone, car navigation system, etc. vendors and service providers; 6. Information.
- Basic commercial rental of space on a map can be priced using a combination of the following variables: 1. Number of sites on the map; 2. Map area ("footprint") per site, in square meters;
- Focusing priority allows commercial content to come into focus faster than it would otherwise, increasing its prominence in the user's "peripheral vision". This feature will be tuned to deliver commercial value without compromising the user's navigation experience.
- 3. Including a conventional web hyperlink in the zoomable content: these may be clearly marked (e.g., with the conventional underlined blue text) and, on the user's click, open a web browser. We can either charge for including such a hyperlink, or, like Google, charge per click.
- Making the geographic area rented refer to an outside commercial server, which will itself host zoomable content of any type and size — this is a fancier version of #3, and allows any kind of e-business to be conducted via the map.
- Billboards: as in real life, many high-visibility areas of the map will have substantial empty space. Companies can buy this space and insert content, including hyperlinks and "hyperjumps", which if clicked will make the user jump through space to a commercial site elsewhere on the map. In contrast to ordinary commercial space, billboard space need not be rented at a fixed location; its location can be generated on the fly during user navigation.
- zoomable Voss content will be possible from within the free client. This will include inserting text, dragging and dropping digital photos, and setting
- Professional authoring software may be a modified version of the client designed to allow more flexible zoomable content creation, as well as facilities for making hyperlinks and hyperjumps, and inserting custom applets.
- Use of the present invention may generate a great deal of aggregate and individual information on spatial attention density, navigation routes and other patterns. These data are of commercial value.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2558833A CA2558833C (en) | 2004-03-17 | 2005-03-16 | Methods and apparatus for navigating an image |
JP2007504079A JP4861978B2 (en) | 2004-03-17 | 2005-03-16 | Method and apparatus for navigating images |
EP05740967A EP1759354A2 (en) | 2004-03-17 | 2005-03-16 | Methods and apparatus for navigating an image |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US55380304P | 2004-03-17 | 2004-03-17 | |
US10/803,010 US7133054B2 (en) | 2004-03-17 | 2004-03-17 | Methods and apparatus for navigating an image |
US10/803,010 | 2004-03-17 | ||
US60/553,803 | 2004-03-17 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2005089403A2 true WO2005089403A2 (en) | 2005-09-29 |
WO2005089403A3 WO2005089403A3 (en) | 2009-02-26 |
Family
ID=34994319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2005/008812 WO2005089403A2 (en) | 2004-03-17 | 2005-03-16 | Methods and apparatus for navigating an image |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1759354A2 (en) |
JP (1) | JP4861978B2 (en) |
CA (1) | CA2558833C (en) |
WO (1) | WO2005089403A2 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008059582A (en) * | 2006-08-29 | 2008-03-13 | Samsung Electronics Co Ltd | Level of detail value calculating method for reducing power consumption, and 3-dimensional rendering system using the same |
WO2010043959A1 (en) * | 2008-10-15 | 2010-04-22 | Nokia Corporation | Method and apparatus for generating an image |
EP2146861B1 (en) * | 2007-04-17 | 2012-12-26 | Volkswagen Aktiengesellschaft | Display device for a vehicle for the display of information relating to the operation of the vehicle and method for the display of the information thereof |
US8935292B2 (en) | 2008-10-15 | 2015-01-13 | Nokia Corporation | Method and apparatus for providing a media object |
EP2556490A4 (en) * | 2010-04-05 | 2017-06-28 | Microsoft Technology Licensing, LLC | Generation of multi-resolution image pyramids |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102332079B (en) * | 2011-09-16 | 2013-12-04 | 南京师范大学 | GIS (geographic information system) vector data disguising and restoring method based on error random interference |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030142099A1 (en) * | 2002-01-30 | 2003-07-31 | Deering Michael F. | Graphics system configured to switch between multiple sample buffer contexts |
US20030156738A1 (en) * | 2002-01-02 | 2003-08-21 | Gerson Jonas Elliott | Designing tread with fractal characteristics |
US20030231190A1 (en) * | 2002-03-15 | 2003-12-18 | Bjorn Jawerth | Methods and systems for downloading and viewing maps |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000046566A (en) * | 1998-07-29 | 2000-02-18 | Aisin Aw Co Ltd | Map display device and storage medium |
JP2002245473A (en) * | 2001-02-16 | 2002-08-30 | Hitachi Eng Co Ltd | Method and device for map display |
DE10226885A1 (en) * | 2002-06-17 | 2004-01-08 | Herman/Becker Automotive Systems (Xsys Division) Gmbh | Method and driver information system for displaying a selected map section |
-
2005
- 2005-03-16 WO PCT/US2005/008812 patent/WO2005089403A2/en not_active Application Discontinuation
- 2005-03-16 JP JP2007504079A patent/JP4861978B2/en active Active
- 2005-03-16 EP EP05740967A patent/EP1759354A2/en not_active Withdrawn
- 2005-03-16 CA CA2558833A patent/CA2558833C/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030156738A1 (en) * | 2002-01-02 | 2003-08-21 | Gerson Jonas Elliott | Designing tread with fractal characteristics |
US20030142099A1 (en) * | 2002-01-30 | 2003-07-31 | Deering Michael F. | Graphics system configured to switch between multiple sample buffer contexts |
US20030231190A1 (en) * | 2002-03-15 | 2003-12-18 | Bjorn Jawerth | Methods and systems for downloading and viewing maps |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008059582A (en) * | 2006-08-29 | 2008-03-13 | Samsung Electronics Co Ltd | Level of detail value calculating method for reducing power consumption, and 3-dimensional rendering system using the same |
EP2146861B1 (en) * | 2007-04-17 | 2012-12-26 | Volkswagen Aktiengesellschaft | Display device for a vehicle for the display of information relating to the operation of the vehicle and method for the display of the information thereof |
WO2010043959A1 (en) * | 2008-10-15 | 2010-04-22 | Nokia Corporation | Method and apparatus for generating an image |
CN102187369A (en) * | 2008-10-15 | 2011-09-14 | 诺基亚公司 | Method and apparatus for generating an image |
US8935292B2 (en) | 2008-10-15 | 2015-01-13 | Nokia Corporation | Method and apparatus for providing a media object |
US9218682B2 (en) | 2008-10-15 | 2015-12-22 | Nokia Technologies Oy | Method and apparatus for generating an image |
US9495422B2 (en) | 2008-10-15 | 2016-11-15 | Nokia Technologies Oy | Method and apparatus for providing a media object |
US10445916B2 (en) | 2008-10-15 | 2019-10-15 | Nokia Technologies Oy | Method and apparatus for generating an image |
EP2556490A4 (en) * | 2010-04-05 | 2017-06-28 | Microsoft Technology Licensing, LLC | Generation of multi-resolution image pyramids |
Also Published As
Publication number | Publication date |
---|---|
CA2558833C (en) | 2014-12-30 |
JP2008501160A (en) | 2008-01-17 |
EP1759354A2 (en) | 2007-03-07 |
JP4861978B2 (en) | 2012-01-25 |
CA2558833A1 (en) | 2005-09-29 |
WO2005089403A3 (en) | 2009-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2812008C (en) | Methods and apparatus for navigating an image | |
AU2006230233B2 (en) | System and method for transferring web page data | |
JP4831071B2 (en) | System and method for managing communication and / or storage of image data | |
WO2005089434A2 (en) | Method for encoding and serving geospatial or other vector data as images | |
US7023456B2 (en) | Method of handling context during scaling with a display | |
US7075535B2 (en) | System and method for exact rendering in a zooming user interface | |
JP4410465B2 (en) | Display method using elastic display space | |
US7287220B2 (en) | Methods and systems for displaying media in a scaled manner and/or orientation | |
US6674445B1 (en) | Generalized, differentially encoded, indexed raster vector data and schema for maps on a personal digital assistant | |
CN101501664A (en) | System and method for transferring web page data | |
US20070064018A1 (en) | Detail-in-context lenses for online maps | |
WO2008054805A2 (en) | Method of client side map rendering with tiled vector data | |
CA2558833C (en) | Methods and apparatus for navigating an image | |
Möser et al. | Context aware terrain visualization for wayfinding and navigation | |
JP2008535098A (en) | System and method for transferring web page data | |
US20090172570A1 (en) | Multiscaled trade cards | |
KR20030015765A (en) | Method and system for providing panorama-typed images on the internet | |
Perlin et al. | Live Paint | |
CA2425990A1 (en) | Elastic presentation space |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| WWE | Wipo information: entry into national phase | Ref document number: 2558833 Country of ref document: CA |
| WWE | Wipo information: entry into national phase | Ref document number: 2005740967 Country of ref document: EP |
| NENP | Non-entry into the national phase in: | Ref country code: DE |
| WWE | Wipo information: entry into national phase | Ref document number: 2007504079 Country of ref document: JP |
| WWW | Wipo information: withdrawn in national office | Country of ref document: DE |
| WWP | Wipo information: published in national office | Ref document number: 2005740967 Country of ref document: EP |