System and Method for Producing a Flexible Geographical Grid
Cross-Reference to Related Applications
[001] This application claims the benefit of the filing date of U.S. Provisional Patent Application entitled "System and Method for Producing a Flexible Geographical Grid," Attorney Docket No. 26152-542-301, filed July 30, 2004. This application is related to U.S. Patent Application No. 10/797,143, entitled "Systems and Methods for Determining Concentrations of Exposure," filed on March 11, 2004. Both of these applications are incorporated herein by reference in their entirety.
[002] The invention relates to a system and method for producing a flexible mesh for use by a computer-implemented system or method for dividing a data population into subdivisions having respective resolutions dependent on the data density and/or relevance in the region of respective subdivisions. The mesh may be two dimensional or multi-dimensional depending on the dimensions of the data population. In one example the invention relates to a system and method for producing a flexible geographical grid for analyzing one or more geographical models including exposure information to identify potential risks.
[003] Catastrophe modeling technology has become a vital tool for quantifying, managing, and transferring risk, particularly in the insurance industry. Any company with financial assets exposed to catastrophes or other loss can benefit from catastrophe modeling. Insurers, reinsurers, brokers, financial markets, corporations, and others have all recognized the need to employ quantitative models based on the synthesis of available scientific research to evaluate the probability of financial loss.
[004] Using computerized models, underwriters price risk based on the evaluation of the probability of loss for a particular location and property type, and manage portfolios of risks according to the degree to which losses correlate from one location to another as part of the same catastrophe event. These probabilistic (stochastic) catastrophe models include, but are not limited to, earthquake, fire following earthquake, tropical cyclone (hurricanes, typhoons, and cyclones), extra-tropical cyclone (windstorm), storm-surge, river flooding, tornadoes, hailstorms, terrorism and other types of catastrophe events. These catastrophe models are built upon detailed geographical databases describing highly localized variations in hazard characteristics, as well as databases capturing property and casualty inventory, building stock, and insurance exposure information.
[005] Modeling systems using these models allow catastrophe managers, analysts, underwriters and others in the insurance markets (and elsewhere) to capture exposure data, to analyze risk for individual accounts or portfolios, to monitor risk aggregates, and to set business strategy. Typical catastrophe modeling systems are built around a geographical model comprising exposure information for specific bounded locations or areas. These locations or areas of interest are typically defined using, for example, postal code boundaries (including ZIP codes), city (or other administrative) boundaries, electoral or census ward boundaries and similar bounded geographical subdivisions.
[006] A drawback of using these types of mechanisms (e.g., postal boundaries, cities, municipalities, building IDs, post codes or ZIP codes) to define locations or areas is that they tend to change over time.
[007] Another drawback of these types of mechanisms to define locations or areas is that they do not allow the system or user the flexibility to select different resolutions. In addition, it may be very difficult to identify a single location that characterizes the risk of the whole geographic area.
[008] These and other drawbacks exist.
[009] An aspect of the invention overcomes or at least ameliorates the drawbacks of the conventional modeling methods and systems.
[010] Viewed from another aspect, the invention provides a system and method for supplying risk assessment information using geographical and exposure data.
[011] Viewed from another aspect, the invention provides a system and method for the flexible modeling of global geographic data.
[012] Viewed from another aspect, the invention provides a system and method for optimizing the resolution used for modeling geographical and hazard data.
[013] Viewed from yet another aspect, the invention provides a system and method for optimizing a data analysis resolution depending on the density and/or importance of the data points in the region of the data population under analysis.
[014] Embodiments of the present invention relate to a system and method of providing a variable resolution grid (VRG). A variable resolution grid provides a way of locating and focusing specific concentrations of risk exposure on a geographical grid to determine projected loss caused by a particular catastrophe. An exposure may be defined as the potential financial liability that may be incurred by a party or parties. A catastrophe may be natural, man-made, or a combination of the two. Some examples of catastrophes may include earthquakes, fires, tornados, hurricanes, typhoons, flooding, blizzards, hailstorms, windstorms, nuclear meltdowns, terrorist acts, labor strikes, war, or other catastrophes.
[015] An embodiment of the invention creates a stable base map by using latitudes and longitudes to define the grid points. This enables the same geographical base map to be used for determining projected loss for more than one catastrophe.
[016] In accordance with one embodiment, the invention enables the use of a wide range of resolutions determined by the number of times the latitude and longitude grid is subdivided. A higher resolution provides a more detailed geographical representation. However, a lower resolution provides for more efficient data storage.
[017] Geocoding is the process of subdividing the latitude and longitude grid into a predetermined number of cells based on the desired number of possible resolutions. A single geographic identifier (also referred to as a GEOID) is created for every geocoded point with a precision of up to 0.0001 degrees (approximately 10 meters). In an illustrative embodiment, eight lower resolutions will also be considered: 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, and 1.0 degree(s). These resolutions correspond to precise subdivisions of a single degree of latitude and longitude. Thus, when modeling the geographical data, an optimal resolution may be chosen from the number of different resolutions provided by the geocoding process.
[018] According to an embodiment of the invention, each GEOID is represented by a number that determines where in a coordinate system the variable resolution grid cell (VRG-cell) is located. The higher resolution geographic identifiers have more digits in order to describe the variable resolution grid cell's location in relation to a greater number of cells or subdivisions.
[019] Some embodiments of the invention generally relate to a method of assigning geographic identifiers to high- or low-resolution locations during a geocoding process and then matching the geographic identifiers with variable resolution exposure data during the exposure data retrieval process.
[020] During exposure data retrieval, an exposure data retrieval engine retrieves a set of GEOIDs corresponding to the one or more exposures being analyzed, one for each variable resolution grid resolution level. These sets of VRG-cell GEOIDs are parsed, rearranged, and concatenated into a "family" of GEOIDs that correspond to different resolutions. The GEOIDs are then queried as a group against available exposure data. In some cases, the exposure data retrieval engine may find a data entry corresponding to more than one GEOID at one or more resolutions. The exposure data retrieval engine selects the highest value GEOID. Because the length of the GEOID is directly related to the resolution, the resolution of the selected GEOID represents an optimal resolution at which the exposure data can be represented.
[021] The exposure data retrieval engine can then store all or some of the available resolution levels at which it found a match in a priority string. Thus, the system can search the stored available resolutions the next time the system searches for exposure data. If exposure data retrieval fails at any level, it falls back to the next available resolution level that has been enabled and searches again for exposure data.
[022] In one embodiment of the invention, the retrieval process retrieves hazard data. Hazard data includes peril-specific tables of hazard values for each variable resolution grid cell's GEOID. Perils can include man made and natural catastrophes, as well as combinations thereof. The hazard values are weighted averages of hazards distributed across the entire area of the corresponding cell. Each peril specific table may have hazard data entries at different variable resolution grids.
[023] During hazard retrieval, a hazard retrieval engine creates a set of VRG-cell GEOIDs, one for each VRG resolution level. These sets of VRG-cell GEOIDs are parsed, rearranged, and concatenated into a "family" of GEOIDs that represent different resolutions. The GEOIDs are then queried as a group against available hazard data. In some cases, the hazard engine will find a hazard data entry corresponding to more than one VRG-cell GEOID. The hazard engine selects the highest value GEOID. Because the length of the GEOID is directly related to the resolution, the resolution of the selected VRG-cell ID represents the optimal resolution.
[024] The hazard retrieval engine then stores all the available resolution levels at which it found a match in a priority string. This enables the stored available resolutions to be searched first for hazard data. If hazard retrieval fails at any level, it falls back to the next available resolution level that has been enabled in the priority string and searches again for hazard data.
[025] It should be recognized that the variable resolution grid can be used with other types of geographical data that correspond to different locations on a location grid. More generally, a variable resolution grid can be used to analyse any data populations represented in multiple dimensional space.
[026] Other objects, advantages, and embodiments of the invention are set forth in part in the description that follows and in part will be apparent to one of ordinary skill in the art.
[027] The following description of embodiments of the present invention is by way of example only, and is made with reference to the accompanying drawings, in which:
[028] FIG. 1 illustrates various resolutions of a variable resolution grid according to an embodiment of the invention.
[029] FIG. 2 illustrates a Cartesian coordinate frame describing an assignment of a quadrant prefix for a geocode.
[030] FIG. 3 illustrates a Cartesian coordinate frame describing an assignment of a sub-quadrant suffix for a geocode.
[031] FIG. 4 illustrates an operation of a system utilizing a variable resolution grid according to various embodiments of the invention.
[032] An embodiment of the invention relates generally to a system and method of providing a Variable Resolution Grid (VRG). A VRG provides a way of locating specific concentrations of exposures on a geographical grid to determine a projected loss caused by a particular catastrophe. An exposure may be defined as the potential financial liability that may be incurred by a party or parties. A catastrophe can be natural, man-made or a combination of the two. Some examples of catastrophes may include earthquakes, fires, tornados, hurricanes, typhoons, flooding, blizzards, hailstorms, windstorms, nuclear meltdowns, terrorist acts, labor strikes, war, or other catastrophes.
[033] An embodiment of the invention creates a stable base map using latitudes and longitudes to define the grid points. This enables the same geographical base map to be used for determining projected loss for more than one catastrophe.
[034] An embodiment of the invention enables the use of a wide range of resolutions determined by the number of times the latitude and longitude grid is subdivided. A higher resolution provides a more detailed geographical representation. However, a lower resolution provides for more efficient data storage.
[035] This embodiment of the invention relates to a system and method of providing a Variable Resolution Grid (VRG). A VRG provides a way of focusing specific concentrations of exposures on a geographical grid to determine projected loss caused by a particular catastrophe. An exposure may be defined as the potential financial liability that may be incurred by a party or parties. A catastrophe may be natural, man-made, or some combination of the two. Some examples of catastrophes may include, but are not limited to, earthquakes, fires, tornados, hurricanes, typhoons, flooding, blizzards, hailstorms, windstorms, nuclear meltdowns, terrorist acts, labor strikes, war, or other catastrophes.
[036] Geocoding is the process of subdividing a latitude and longitude grid into a predetermined number of cells based on the desired number of possible resolutions. A single geographic identifier (also referred to as a GEOID) is created for every geocoded point with a precision of up to 0.0001 degrees (approximately 10 meters). In an illustrative embodiment, eight lower resolutions may also be utilized, namely 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, and 1.0 degree(s). The lower resolutions include four "standard" levels (0.001, 0.01, 0.1, and 1.0 degree) and four "intermediate" levels differing from the next resolution by 50% (0.0005, 0.005, 0.05, 0.5 degree). These resolutions correspond to precise subdivisions of a single degree of latitude and longitude. Thus, when modeling the geographical data, an optimal resolution is chosen from the number of different resolutions provided by the geocode based on the resolution of the geographic model.
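The nine resolutions listed above can be illustrated with a minimal Python sketch that snaps the absolute value of a coordinate down to the nearest cell boundary at each resolution. The function names `snap` and `snap_all`, the use of decimal strings as input, and the sample coordinate are assumptions made for illustration; snapping towards the origin is also only one possible convention.

```python
from decimal import Decimal

# The nine resolutions named in the text, in degrees, coarsest first.
RESOLUTIONS = [Decimal(r) for r in
               ("1.0", "0.5", "0.1", "0.05", "0.01",
                "0.005", "0.001", "0.0005", "0.0001")]


def snap(coord: str, resolution: Decimal) -> Decimal:
    """Snap |coord| down to a whole multiple of `resolution`.

    Assumption: the cell is identified by its corner nearest the origin,
    so the absolute value is truncated rather than rounded.
    """
    magnitude = abs(Decimal(coord))
    return (magnitude // resolution) * resolution


def snap_all(coord: str) -> dict:
    """Return the snapped value of `coord` at every resolution."""
    return {res: snap(coord, res) for res in RESOLUTIONS}


if __name__ == "__main__":
    # Arbitrary sample longitude, chosen only for illustration.
    for res, value in snap_all("-80.8442").items():
        print(res, value)
```

Decimal arithmetic is used here rather than binary floating point so that values such as 0.0005 are represented exactly.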
[037] FIG. 1 illustrates a VRG framework 100 according to one embodiment of the invention. VRG framework 100 includes cells at various resolutions including a 1 degree cell 110, a 0.5 degree cell 120, a 0.1 degree cell 130, and a 0.05 degree cell 140. While not illustrated in FIG. 1, VRG framework 100 may include additional resolutions including 0.01, 0.005, 0.001, 0.0005, and 0.0001 degrees as noted above.
[038] In an illustrative embodiment, a GEOID for a cell at a particular longitude and latitude pair may be represented as:
RRR YY X1Y1 X2Y2 X3Y3 X4Y4 (Q)
where:
XXX.X1X2X3X4 is the absolute value of longitude expressed to four decimal places;
YY.Y1Y2Y3Y4 is the absolute value of latitude expressed to four decimal places;
RRR (also referred to as a quadrant prefix) is a sum (expressed with three digits) of a value V assigned to a quadrant in which the cell is located (see FIG. 2) and the whole portion of the longitude coordinate (XXX);
V is from the set {100, 300, 500, 700} assigned to the quadrant in which the cell is located;
YY represents two digits corresponding to the whole portion of the latitude coordinate;
X1Y1 represents two digits, one corresponding to the tenths component of the longitude coordinate (X1) and one corresponding to the tenths component of the latitude coordinate (Y1);
X2Y2 represents two digits, one corresponding to the hundredths component of the longitude coordinate (X2) and one corresponding to the hundredths component of the latitude coordinate (Y2);
X3Y3 represents two digits, one corresponding to the thousandths component of the longitude coordinate (X3) and one corresponding to the thousandths component of the latitude coordinate (Y3);
X4Y4 represents two digits, one corresponding to the ten-thousandths component of the longitude coordinate (X4) and one corresponding to the ten-thousandths component of the latitude coordinate (Y4); and
Q (also referred to as a sub-quadrant suffix) represents an intermediate "sub-quadrant" of the next lowest-resolution standard grid and has a value from the set {1, 3, 5, 7} assigned to that sub-quadrant (see FIG. 3).
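By way of a hedged illustration, the GEOID layout described above may be sketched in Python as follows. FIG. 2 and FIG. 3 are not reproduced in this text, so the tables `QUADRANT_VALUE` and `SUBQUADRANT_VALUE` below are placeholder assignments, not the assignments of the figures; the function name `geoid`, its parameters, and the sample coordinates are likewise hypothetical.

```python
from decimal import Decimal

# ASSUMPTION: placeholder for FIG. 2. Key is (longitude >= 0, latitude >= 0).
QUADRANT_VALUE = {(True, True): 100, (False, True): 300,
                  (False, False): 500, (True, False): 700}
# ASSUMPTION: placeholder for FIG. 3. Key is (X digit >= 5, Y digit >= 5).
SUBQUADRANT_VALUE = {(False, False): 1, (True, False): 3,
                     (True, True): 5, (False, True): 7}


def _digits(coord: str):
    """Return (whole part, four fractional digits) of |coord|, truncated."""
    scaled = int(abs(Decimal(coord)) * 10000)
    return scaled // 10000, f"{scaled % 10000:04d}"


def geoid(lon: str, lat: str, decimals: int, intermediate: bool = False) -> str:
    """Build RRR YY X1Y1 ... with `decimals` digit pairs (0 to 4).

    With intermediate=True, the suffix Q is appended, derived from the
    first dropped fractional digits of longitude and latitude.
    """
    if not 0 <= decimals <= 4:
        raise ValueError("decimals must be between 0 and 4")
    xxx, xfrac = _digits(lon)
    yy, yfrac = _digits(lat)
    v = QUADRANT_VALUE[(Decimal(lon) >= 0, Decimal(lat) >= 0)]
    code = f"{v + xxx:03d}{yy:02d}"
    code += "".join(xfrac[k] + yfrac[k] for k in range(decimals))
    if intermediate:
        if decimals == 4:
            raise ValueError("no digits are dropped at four decimal places")
        key = (int(xfrac[decimals]) >= 5, int(yfrac[decimals]) >= 5)
        code += str(SUBQUADRANT_VALUE[key])
    return code


if __name__ == "__main__":
    # Arbitrary sample point (west longitude, north latitude).
    print(geoid("-80.8442", "32.7765", 4))        # 13-digit, 0.0001 degree
    print(geoid("-80.8442", "32.7765", 0, True))  # 6-digit, 0.5 degree
```

Under these placeholder tables, the 1.0 degree GEOID has five digits and each further decimal place adds one interleaved digit pair, which matches the thirteen-digit maximum described in the text.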
[039] According to various embodiments of the invention, quadrant prefix RRR is selected as a space-saving mechanism, because the first digit of the 3-digit X coordinate can only be 0 or 1 (the range of RRR is {100-880}, as the range of XXX is {0 - 180}). Both quadrant prefix RRR and sub-quadrant suffix Q are based on a modified division of the standard Cartesian plane, such as that illustrated in FIG. 2 and FIG. 3, respectively. As an example of the above embodiment of the invention, the GEOIDs corresponding to two coordinate pairs, representing high-resolution geocode matches in South Carolina (SC) and Hawaii (HI) are set forth in Table I.
[040] In this example, each high-resolution grid cell location is assigned a grid code with up to thirteen digits during the geocoding process, regardless of the presence of supporting geographical modeling data at the particular resolution. In some embodiments of the invention, the coordinates correspond to the corner of the grid cell closest to the origin. Other embodiments may select other locations on the perimeter or interior of the grid cell as would be apparent. As illustrated below in Table II, these multiple resolutions may reveal themselves in the length of the corresponding codes: the higher resolution cells will have more digits (and hence a larger "value") than the lower resolution cells. Furthermore, in some embodiments of the invention, GEOIDs for "standard" resolutions will have an odd number of digits, while "intermediate" GEOIDs will have an even number.
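The relationship between code length and resolution can be made concrete with a short sketch. It assumes the layout described earlier (a five-digit RRR YY base, one two-digit pair per decimal place, and an optional one-digit suffix Q); the function name `resolution_of` is hypothetical.

```python
from decimal import Decimal


def resolution_of(geoid: str) -> Decimal:
    """Infer the cell resolution in degrees from the GEOID length.

    Odd lengths (5, 7, ..., 13) are "standard" levels; even lengths
    (6, 8, 10, 12) carry a suffix Q and are "intermediate" levels at
    half the cell size of the standard level they extend.
    """
    n = len(geoid)
    if not 5 <= n <= 13:
        raise ValueError("GEOID must have between 5 and 13 digits")
    pairs, has_suffix = divmod(n - 5, 2)
    standard = Decimal(10) ** -pairs  # 1, 0.1, 0.01, ...
    return standard / 2 if has_suffix else standard
```

For example, a seven-digit code maps to 0.1 degree and an eight-digit code to 0.05 degree.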
[041] The principle behind the quadrant prefix RRR and the sub-quadrant suffix Q is similar, depending on the four possible combinations of longitude X and latitude Y. The quadrant prefix RRR depends on whether the longitude and latitude are positive or negative (i.e., greater than or less than zero). In some embodiments of the invention, all U.S. locations are in quadrant 3, so that the absolute value of longitude X increases from right to left. The sub-quadrant suffix Q depends on whether the dropped fractional components Xj and Yj of longitude and latitude are greater than or equal to (>=) or less than (<) five. In some embodiments of the invention, sub-quadrant suffix Q is always odd, while the first digit of the quadrant prefix RRR can be even.
[042] In some embodiments, the precision of geocoded coordinates is 10 to 20 times the most detailed available VRG resolution. This is to account for rounding errors in a host computer's microprocessor. For example, resolution level-6 (0.005 degrees) or level-7 (0.001 degrees) requires at least four decimal places to reliably assign GEOIDs.
[043] For each geocoded location, a hazard-retrieval engine creates a set of thirteen GEOIDs, one for each VRG resolution level, and then searches for a match with the GEOIDs in the appropriate hazard table. As discussed above, the hazard data may include a plurality of peril-specific tables. In some embodiments of the invention, if hazard data is available at more than one GEOID corresponding to one or more resolution levels, then the process will select the data corresponding to the highest-value (e.g., highest-resolution) GEOID. Because the length of the GEOID is directly related to the VRG resolution, this will represent an optimal resolution for that location. Thus, the process selects the best available resolution at which the hazard data can be represented.
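The selection of the highest-value GEOID among several matches might be sketched as below. The hazard table is modeled as a plain dictionary and the hazard values are invented for illustration; neither is the storage format of the described system, and `best_match` is a hypothetical name.

```python
def best_match(family, hazard_table):
    """Return (geoid, hazard) for the highest-value GEOID found, else None.

    `family` is the set of GEOIDs generated for one location, one per
    resolution level. Because every GEOID starts with a non-zero digit,
    a longer (higher-resolution) code always has a larger integer value.
    """
    matches = [g for g in family if g in hazard_table]
    if not matches:
        return None
    best = max(matches, key=int)
    return best, hazard_table[best]


if __name__ == "__main__":
    # Illustrative family for one location and an invented hazard table.
    family = ["38032", "380325", "3803287", "380328747"]
    hazard_table = {"38032": 0.1, "3803287": 0.3}
    print(best_match(family, hazard_table))
```

Here hazard data exists at the 1.0 degree and 0.1 degree levels, so the 0.1 degree entry is selected.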
[044] In some embodiments of the invention, in order to minimize the number of searches for nonexistent data, an indicator, such as a character string in the registry, may specify one or more resolution levels of the available hazard data.
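One way such an indicator could limit the search is sketched below. The comma-separated format of the priority string, the numbering of levels from 1 (1.0 degree) to 9 (0.0001 degree), and the name `lookup_with_priority` are all assumptions; the text does not specify the format of the registry string.

```python
def lookup_with_priority(family_by_level, hazard_table, priority):
    """Search only the resolution levels named in `priority`, in order.

    `priority` is assumed to be a comma-separated list of level numbers,
    most detailed first (for example "9,5,1"). If the lookup fails at one
    level, the search falls back to the next level in the string.
    Returns (level, hazard) for the first hit, or None.
    """
    for level in (int(p) for p in priority.split(",")):
        geoid = family_by_level.get(level)
        if geoid is not None and geoid in hazard_table:
            return level, hazard_table[geoid]
    return None


if __name__ == "__main__":
    # Illustrative GEOIDs keyed by assumed level number; invented values.
    family_by_level = {1: "38032", 2: "380325",
                       5: "380328747", 9: "3803287474625"}
    hazard_table = {"380328747": 0.42, "38032": 0.10}
    print(lookup_with_priority(family_by_level, hazard_table, "9,5,1"))
```

Levels absent from the priority string are never queried, which is the saving the indicator is intended to provide.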
[045] As part of the hazard retrieval process, some portion of the assigned thirteen digit GEOID may be matched with the data (e.g., records, entries, etc.) in a series of hazard data tables. The hazard data tables can include a series of peril-specific data tables stored in peril specific databases. For example, a peril specific database may be given the name xxVHAZpp, where xx represents a two character country ID and pp represents a two character peril ID.
[046] In some embodiments of the invention, each variable peril is represented by only one table including records for all relevant VRG resolutions. The typical VRG peril specific data table may include one or more hazard specific data values for each GEOID as would be apparent.
[047] Embodiments of the present invention can be used in a variety of types of exposure accumulation analyses including Specific Area Analysis, Damage Footprint Analysis, "Spider" Analysis, and Building Level Analysis. Various ones of these are described in U.S. Patent Application No. 10/797,143, entitled "Systems and Methods for Determining Concentrations of Exposure," filed on March 11, 2004, which is herein incorporated by reference. Other types of "what-if" analyses may be used.
[048] Specific Area Analysis and Damage Footprint Analysis each enable a user to define an area of interest and run one or more algorithms on the selected area(s) to determine exposure information in a known manner. For example, the user may define the area of interest by plotting a circle using a specified radius around a specified latitude and longitude. The analysis returns the accumulation (exposure concentration) within the radius around the selected point. In Damage Footprint Analysis, the user may likewise define the area of interest, but instead of assuming uniform loss to all locations within the circle, the level of loss rises towards the centre and can be represented as a series of concentric rings. Each ring may represent a different loss or exposure concentration. A Spider Analysis may enable a user to search a region for areas where exposure accumulation meeting certain criteria exists (e.g., damage is above a certain level). For example, the user can use 100% loss and perform an analysis that would return all the areas within a region having 100% loss.
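The circle-based accumulation and the concentric-ring footprint can be illustrated with the following sketch. It is not the algorithm of the referenced application; the haversine great-circle distance, the mean Earth radius constant, the tuple layout of the exposures, and the ring loss fractions are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical model)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accumulation(exposures, center, radius_km):
    """Sum the exposure values located within radius_km of center."""
    return sum(value for lat, lon, value in exposures
               if haversine_km(center[0], center[1], lat, lon) <= radius_km)


def footprint_loss(exposures, center, rings):
    """Sum value * loss fraction, using the innermost ring containing each point.

    `rings` is a list of (outer_radius_km, loss_fraction), innermost first.
    Points outside the outermost ring contribute no loss.
    """
    total = 0.0
    for lat, lon, value in exposures:
        distance = haversine_km(center[0], center[1], lat, lon)
        for outer_radius, fraction in rings:
            if distance <= outer_radius:
                total += value * fraction
                break
    return total


if __name__ == "__main__":
    # (latitude, longitude, exposure value); invented sample data.
    exposures = [(32.0, -80.0, 100.0), (32.01, -80.0, 200.0), (33.0, -80.0, 400.0)]
    center = (32.0, -80.0)
    print(accumulation(exposures, center, 2.0))
    print(footprint_loss(exposures, center, [(0.5, 1.0), (2.0, 0.5)]))
```

In the sample data the second exposure lies roughly 1.1 km from the centre and the third roughly 111 km away, so only the first two fall within the 2 km circle.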
[049] A building level analysis may be used to analyze one or more selected buildings. These and other analyses may be used with the invention.
[050] The output selection may be configured to write out more detail than for a traditional catastrophe analysis. For example, when a user performs an accumulation analysis on a portfolio, loss information can be output for the portfolio, for each account in the portfolio, and for each location. Such output selections permit viewing results and generating maps and reports.
[051] FIG. 4 illustrates an operation of a system that utilizes a variable resolution grid according to various embodiments of the present invention. In an operation 410, geocodes are generated at each of one or more resolutions within a particular area or region. In some embodiments, geocodes are generated for each of the possible resolutions. In some embodiments, geocodes are generated for those resolutions available for the corresponding geographic modeling data. In an operation 420, the generated geocode is used to retrieve hazard data from a hazard table for that particular geocode at a corresponding resolution. In some embodiments of the invention, hazard data is retrieved at each of the resolutions for the particular generated geocode. In other embodiments, hazard data is retrieved for the geocode corresponding to the highest resolution of the hazard data available. In some embodiments of the invention, if no hazard data is found for a particular geocode, a next lower resolution associated with that geocode is determined and that determined geocode is used to retrieve hazard data. In an operation 430, the hazard data at the particular location and resolution corresponding to the geocode is used to assess potential risk as would be apparent.
[052] While particular embodiments of the invention have been described, it is to be understood that modifications will be apparent to those skilled in the art without departing from the spirit of the invention. The scope of the invention is not limited to the specific embodiments described herein. Other embodiments, uses and advantages of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The specification should be considered illustrative only, and the scope of the invention is accordingly intended to be limited by the following claims.
[053] In one embodiment, the choice of grid resolution is automated by configuring data processing apparatus, such as a computer system, with a series of rules and applying those rules to regions of a data population, for example data points laid out over a geographical grid. The rules are based on variations which may occur in a key data parameter, and/or on the importance of data parameters for a region of the data population.
[054] In one example of analysing hazard risk, for example flood risk, a key data parameter on a geographical grid is elevation. A rule is set up which identifies those regions of a data population which have potential for flooding. For those regions a rule is set which defines a threshold elevation below which flood hazard is a factor. A further rule may be set which seeks to minimize the number of relevant elevation points (i.e. elevations below the threshold) in a grid cell. This may be achieved by starting with a large default cell size and progressively reducing the cell size (that is, increasing the grid resolution) for a region in which more than one relevant elevation point exists, until each cell holds at most one such point or the highest grid resolution is reached.
[055] A further rule may also be applied in which financial value data points over the grid are analysed. Such a rule may determine that in a region where no relevant financial data points (i.e. values over a threshold) exist, no increase in resolution is necessary at all.
[056] The computer system automatically applies these rules to the grid and data population to derive a varied resolution grid across the region of interest. The resolution of the grid is highest in regions where there are relevant financial data points and elevations below the threshold elevation. A relevant financial data point may be the cost of a property, or the life insurance of residents.
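The rules described above may be illustrated with a minimal recursive sketch. It assumes that coordinates are held as non-negative integers in units of 0.0001 degree, that each point is a tuple (x, y, elevation, value), and that a cell is subdivided only while it holds more than one below-threshold elevation point and at least one above-threshold financial value. The function name `refine` and these data layouts are hypothetical and are not taken from the described system.

```python
# Cell sizes in units of 0.0001 degree, coarsest first (1.0 down to 0.0001).
RES_UNITS = [10000, 5000, 1000, 500, 100, 50, 10, 5, 1]


def refine(points, x0, y0, level, elev_max, value_min):
    """Return the leaf cells (x0, y0, size) covering one cell at `level`.

    Rule 1: elevations below `elev_max` are relevant to flood hazard.
    Rule 2: a cell with no financial value above `value_min` is not refined.
    Rule 3: a cell with more than one relevant elevation point is split
            into cells of the next finer size, down to the finest level.
    """
    size = RES_UNITS[level]
    inside = [p for p in points
              if x0 <= p[0] < x0 + size and y0 <= p[1] < y0 + size]
    low = [p for p in inside if p[2] < elev_max]
    valuable = [p for p in inside if p[3] > value_min]
    if level == len(RES_UNITS) - 1 or len(low) <= 1 or not valuable:
        return [(x0, y0, size)]
    child = RES_UNITS[level + 1]
    leaves = []
    for cx in range(x0, x0 + size, child):
        for cy in range(y0, y0 + size, child):
            leaves += refine(inside, cx, cy, level + 1, elev_max, value_min)
    return leaves


if __name__ == "__main__":
    # Two low-lying, high-value points close together (invented data).
    points = [(100, 100, 2.0, 500000.0), (600, 600, 1.5, 750000.0)]
    cells = refine(points, 0, 0, 0, elev_max=5.0, value_min=100000.0)
    print(len(cells), min(size for _, _, size in cells))
```

In this sample the grid is refined from 1.0 degree down to 0.05 degree only around the two points, while the rest of the original cell remains at coarser sizes.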
[057] One way of determining a residential density is to look at the spacing between post codes or ZIP codes; another is to use satellite imagery.
[058] An example of varying data points of concern would be wind speeds which may vary rapidly according to terrain and proximity to a coastline.
[059] By varying grid resolution such that high resolution is only used where there are data points of concern, importance or interest, GEOIDs such as described above may be used to optimize storage space and information retrieval times. This is a highly efficient way of storing geographic data (in general, location data), such that the resolution is relevant to the importance of the underlying data to the analysis.
[060] For embodiments in which the principles of the present invention are applied to multi-dimensional data systems, e.g. 3-dimensional data systems, spheres and cubes and other 3-dimensional objects are used to plot areas of interest. For higher dimensional data systems, suitable multi-dimensional objects may be used.
[061] Embodiments of the invention may be used to analyse data type distribution in other applications, for example in epidemiological studies, pollution studies and economic studies, where it is desirable to have a higher resolution for areas of interest.
[062] Insofar as embodiments of the invention described above are implementable, at least in part, using a computer system, it will be appreciated that a computer program for implementing at least part of the described methods and/or the described systems is envisaged as an aspect of the present invention. The computer system may be any suitable apparatus, system or device. For example, the computer system may be a programmable data processing apparatus, a general purpose computer, a Digital Signal Processor or a microprocessor. The computer program may be embodied as source code and undergo compilation for implementation on a computer, or may be embodied as object code, for example.
[063] Suitably, the computer program can be stored on a carrier medium in computer usable form, which is also envisaged as an aspect of the present invention. For example, the carrier medium may be solid-state memory, optical or magneto-optical memory such as a readable and/or writable disk, for example a compact disk and a digital versatile disk, or magnetic memory such as disc or tape, and the computer system can utilise the program to configure it for operation. The computer program may be supplied from a remote source embodied in a carrier medium such as an electronic signal, including a radio frequency carrier wave or an optical carrier wave.