WO2002017120A2

WO2002017120A2 - Generating and providing information about expected future prices of assets and visualization of asset information

Info

Publication number: WO2002017120A2
Application number: PCT/US2001/025753
Authority: WO
Inventors: Philip A. Cooper; Lisette Cooper; Stewart Myers; G. David Forney, Jr.; Leonard L. Scott, Jr.; Benjamin Shectman; Raymond Leclair; Yongxiang Li
Original assignee: Thinking Investments, Inc.
Priority date: 2000-08-18
Filing date: 2001-08-17
Publication date: 2002-02-28
Also published as: JP2004519753A; EP1309925A1; AU2001283425A1

Description

GENERATING AND PROVIDING INFORMATION ABOUT EXPECTED FUTURE PRICES OF ASSETS AND VISUALIZATION OF ASSET INFORMATION

BACKGROUND

This invention relates to generating and providing information about expected future prices of assets, and to visualization of asset information.

Among the kinds of information available at web sites on the Internet are current and historical prices and volumes of stock transactions, prices of put or call options at specific strike prices and expiration dates for various stocks, and theoretical prices of put and call options that are derived using formulas such as the Black-Scholes formula. Some web sites give predictions by individual experts of the future prices or price ranges of specific stocks.

A call option gives the holder a right to buy an underlying marketable asset by an expiration date for a specified strike price. A put option gives an analogous right to sell an asset. Options are called derivative securities because they derive their values from the prices of the underlying assets. Examples of underlying assets are corporate stock, commodity stock, and currency. The price of an option is sometimes called the premium.

People who buy and sell options are naturally interested in what appropriate prices might be for the options. One well-known formula for determining the prices for call and put options under idealized conditions is called the Black-Scholes formula. Black-Scholes provides an estimate of call or put prices for options having a defined expiration date, given a current price of the underlying asset, an interest rate, and the volatility rate (sometimes simply called volatility) of the asset. Black-Scholes assumes constant interest rates and volatility, no arbitrage, and trading that is continuous over a specified price range.

Information about investment assets such as corporate securities is often presented as tables of values or ratios of values for successive time periods.

Sometimes graphs or visualization devices are used to provide a more intuitive view of the information.

One on-line service, Morningstar.com, uses a scatter plot in its Morningstar Investment Radar, URL (http://screen.morningstar.com/InvestmentRadar/InvestmentRadar.html). Each point in the plot represents risk versus capitalization of an asset in a portfolio.

Another on-line facility, FalconEye, URL (http://www.falconeye.com/falconeye/tracker/index.html), displays a periscope-like view of a simulated cloud formation that represents a multi-dimensional density map of all 6000+ Nasdaq stocks, sorted in real-time by FalconEye Viz-Alerts™ (customizable indicators) that were created for the vertical and horizontal axes. Each stock is like a pixel on the screen and each color represents the density of stocks depicted in that section of the Tracker Live Map. ... The distribution of density allows you to instantly see the real-time technical pressures on the market and gives you the knowledge to trade more efficiently and productively.

ValuEngine, URL (http://valuengine.com/servlet/ValuationSummary#), displays graphs of stock prices that include historical prices to a current date followed by forecast price trends for future periods, including forecast ranges above and below the forecast price trends.

Attorney Docket 11910-002001

SUMMARY

In general, in one aspect, the invention features a method in which data is received that represents current prices of options on a given asset. An estimate is derived from the data of a corresponding implied probability distribution of the price of the asset at a future time. Information about the probability distribution is made available within a time frame that is useful to investors, for example, promptly after the current option price information becomes available.

Implementations of the invention may include one or more of the following features. The data may represent a finite number of prices of options at spaced-apart strike prices of the asset. A set of first differences may be calculated of the finite number of prices to form an estimate of the cumulative probability distribution of the price of the asset at a future time. A set of second differences may be calculated of the finite number of strike prices from the set of first differences to form the estimate of the probability distribution function of the price of the asset at a future time.

In general, in another aspect, the invention features a method in which a real time data feed is provided that contains information based on the probability distribution.

In general, in another aspect, the invention features a method that includes providing a graphical user interface for viewing pages containing financial information related to an asset; and when a user indicates an asset of interest, displaying probability information related to the price of the asset at a future time.

In general, in another aspect, the invention features a method that includes receiving data representing current prices of options on a given asset, the options being associated with spaced-apart strike prices of the asset at a future time. The data includes shifted current prices of options resulting from a shifted underlying price of the asset, the amount by which the asset price has shifted being different from the amount by which the strike prices are spaced apart. An estimate is derived from a quantized implied probability distribution of the price of the asset at a future time, the elements of the quantized probability distribution being more finely spaced than for a probability distribution derived without the shifted current price data.

In general, in another aspect, the invention includes deriving from said data an estimate of an implied probability distribution of the price of the asset at a future time, the mathematical derivation including a smoothing operation.

Implementations of the invention may include one or more of the following features. The smoothing operation may be performed in a volatility domain.

In general, in another aspect, the invention includes deriving a volatility for each of the future dates in accordance with a predetermined option pricing formula that links option prices with strike prices of the asset; and generating a smoothed and extrapolated volatility function.

Implementations of the invention may include one or more of the following features. The volatility function may be extrapolated to a wider range of dates than the future dates and to other strike prices. The smoothed volatility function may be applicable to conditions in which the data is reliable under a predetermined measure of reliability. The implied volatility function formula may have a quadratic form with two variables representing a strike price and an expiration date. The coefficients of the implied volatility function formula may be determined by applying regression analysis to approximately fit the implied volatility function formula to each of the implied volatilities.

In general, in another aspect, the invention features a method that includes receiving data representing current prices of options on assets belonging to a portfolio, deriving from the data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on the assets belonging to the portfolio, and making information about the probability distribution available within a time frame that is useful to investors.

In general, in another aspect, the invention features a method that includes receiving data representing values of a set of factors that influence a composite value, deriving from the data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on assets belonging to a portfolio, and making information about the probability distribution available within a time frame that is useful to investors.

Implementations of the invention may include one or more of the following features. The mathematical derivation may include generating a multivariate probability distribution function based on a correlation among the factors.

In general, in another aspect, the invention features a graphical user interface that includes a user interface element adapted to enable a user to indicate a future time, a user interface element adapted to show a current price of an asset, and a user interface element adapted to show the probability distribution of the price of the asset at the future time.

In general, in one aspect, the invention features a method that includes continually generating current data that contains probability distributions of prices of assets at future times, continually feeding the current data to a recipient electronically, and the recipient using the fed data for services provided to users.

In general, in another aspect, the invention features a method that includes receiving data representing current prices of options on assets belonging to a portfolio, receiving data representing current prices of market transactions associated with a second portfolio of assets, and providing information electronically on the probability that the second portfolio of assets will reach a first value given the condition that the first portfolio of assets reaches a specified price at a future time.

In general, in another aspect, the invention features a method that includes receiving data representative of actual market transactions associated with a first portfolio of assets; receiving data representative of actual market transactions associated with a second portfolio of assets; and providing, through a network, information on the expectation value of the price of the first portfolio of assets given the condition that the second portfolio of assets reaches a first specified price at a specified future time.

In general, in another aspect, the invention features a method that includes evaluating an event defined by a first multivariate expression that represents a combination of macroeconomic variables at a time T, and estimating (e.g., using Monte Carlo techniques) the probability that a second multivariate expression that represents a combination of values of assets of a portfolio will have a value greater than a constant B at time T if the value of the first multivariate expression is greater than a constant A. The market variables represented by the first multivariate expression can include macroeconomic factors (such as interest rates), market preferences regarding the style of company fundamentals (large/small companies, rapid/steady growth, etc.), or market preferences for industry sectors.

In general, in another aspect, the invention features a method that includes defining a regression expression that relates the value of one variable representing a combination of macroeconomic variables at time T to a second variable at time T that represents a combination of assets of a portfolio, and estimating the probability that the second variable will have a value greater than a constant B at time T if the value of the first variable is greater than a constant A at time T, based on the ratio of the probability of x being greater than A under the regression expression and the probability of x being greater than A.

In general, in another aspect, the invention features a method that includes defining a current value of an option as a quadratic expression that depends on the difference between the current price of the option and the current price of the underlying security, and using Monte Carlo techniques to estimate a probability distribution of the value at a future time T of a portfolio that includes the option.

The invention takes advantage of the realization that option prices for a given underlying asset are indicative of the market's prediction of the risk-neutral price of the underlying asset in the future (e.g., at the expiration of the option). Option price data may be used to derive the market's prediction in the form of an implied probability distribution of future risk-neutral prices. Additional explanation of the significance of the phrase risk-neutral is contained in the Appendix.

The implied probability distribution and other information related to it may be made easily available to people for whom the information may be useful, such as those considering an investment in the underlying asset, or a brokerage firm advising such an investor.

In general, in another aspect, the invention features a method that includes (a) displaying to a user a circular visualization element having sectors arranged around a center of the element, the sectors respectively corresponding to different groups of assets, and (b) in each of the sectors, displaying an array of visual elements representative of respective assets belonging to the group to which the sector corresponds, the visual elements being arrayed with respect to distance from the center in accordance with magnitudes of performance of the assets during a recent period.

Implementations of the invention may include one or more of the following features. The visual elements comprise displayed dots, one for each of the assets. The visual elements exhibit visible characteristics that correspond to categories of the assets within the group. The categories of the assets within the group correspond to different capitalizations. The dots are arranged along a radius of the sector to which they belong. Dots that would otherwise lie on the radius at a given distance from the center are displayed at different angular positions near to the radius. Each sector has an angular extent that represents the fraction of asset items in the sector relative to the total number of asset items in the universe being plotted. The circular visualization element is subdivided into rings having respectively different distances from the center. The rings are displayed in different colors. The magnitudes of performance of the assets are measured in percentage price change. The recent period comprises a trading day on an asset market. The assets comprise securities issued by corporations.

In general, in another aspect, the invention features a method that includes displaying to a user a visualization element that indicates the odds of a performance measure of an asset being within specified ranges of identified values of the performance measure at a succession of times in the future.

Implementations of the invention may include one or more of the following features. The performance measure comprises a price of the asset or a return percentage or a tax-adjusted return percentage. The visualization element includes stripes superimposed on a graph of the performance measure over time, each of the stripes representing one of the specified ranges. Each of the stripes begins at a current time and becomes broader as it extends to future times. A graphical device shows actual historical values of the performance measure, e.g., in the form of a line graph one end of which joins the visualization element at a point that represents a current date. The visualization element includes two portions, one of the portions representing the odds prior to a specified date based on one assumption, the other of the portions representing the odds after the specified date based on another assumption. The specified date is a date on which tax effects change from the one assumption to the other assumption.

In general, in another aspect, the invention features a method that includes displaying to a user a visualization element having graphical indicators of the relative performance of a selected asset compared with the performance of groups of assets in each of a succession of time periods, each of the groups comprising assets representing a common style. The relative performance is determined using an asset class factor model.

Among the advantages of the invention are one or more of the following: Investors and prospective investors in an underlying asset, such as a publicly-traded stock, are given access to a key additional piece of current information, namely calculated data representing the market's view of the future price of the stock. Brokerage firms, investment advisors, and other companies involved in the securities markets are able to provide the information or related services to their clients and customers. A user is enabled to quickly visualize and grasp the significance of data that would otherwise be more difficult to understand.

Other features and advantages will become apparent from the following description and from the claims.

DESCRIPTION

Details of implementations of the invention are set forth in the figures and the related description below.

Figures 1, 2, and 3 are graphs.

Figure 4 is a block diagram.

Figures 5, 6, and 7 are web pages.

Figures 8 and 9 illustrate user interfaces.

Figure 10 shows data structures.

Figures 11 through 15 show visualization techniques.

In general, the price of a call or put option is determined by buyers and sellers in the option market and carries information about the market's prediction of the expected price of the underlying asset at the expiration date. (The information does not include the premium that investors require for bearing risk, which must be estimated separately. The average long-term value of the risk premium is about 6% per year for all stocks and may be adjusted for an individual stock's historical responsiveness to broader market movements.)

The information carried in the prices of options having various strike prices and expirations is used to derive probability distributions of the asset's price at future times and to display corresponding information to investors, for example, on the World Wide Web.

Basic method

We first define some relevant terms. We define x as the strike price, c(x) as the theoretical call price function (the price of the call as a function of strike price), p(x) as the theoretical put price function, F(x) as the cumulative distribution function (cdf) of the price of the underlying asset at expiration, and f(x) as the probability density function (pdf) of the asset price at expiration. By definition, f(x) = F'(x) (i.e., the probability density function is the derivative of the cumulative distribution function).

The relationship between c(x), p(x), f(x), and F(x) can be succinctly stated as:

$$F(x) = c'(x) + 1 = p'(x); \tag{1a}$$

$$f(x) = c''(x) = p''(x). \tag{1b}$$

In words, the pdf is the second derivative of either the call price function or the put price function. A simple proof of these relationships is given in the Appendix. The Appendix also contains other detailed information relating to the features of the invention.
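One way to see why these relationships hold is sketched here under a simplifying assumption that is not stated in the text above: the interest rate is taken as zero, so that each option price equals the expected payoff at expiration under the density f. The proof in the Appendix may proceed differently. The integration variable u is introduced only for this sketch.

$$c(x) = \int_x^{\infty} (u - x)\, f(u)\, du, \qquad p(x) = \int_0^{x} (x - u)\, f(u)\, du$$

$$c'(x) = -\int_x^{\infty} f(u)\, du = F(x) - 1, \qquad p'(x) = \int_0^{x} f(u)\, du = F(x)$$

$$c''(x) = p''(x) = F'(x) = f(x)$$

The first derivatives follow from differentiating under the integral sign; the boundary terms vanish because the integrands are zero at u = x.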

This so-called "second-derivative method" for computing implied probability distributions from option price data is known in the academic literature, but apparently not very well known. For example, the standard textbook "Options, Futures, and Other Derivatives," by John C. Hull (Fourth Edition, 1999; Prentice-Hall) mentions implied probabilities, but not the second-derivative method. Perhaps the best reference that we have been able to find is J. C. Jackwerth and M. Rubinstein, "Recovering probability distributions from option prices," J. Finance, vol. 51, pp. 1611-1631 (1996), which has only six prior references. This paper cites D. T. Breeden and R. H. Litzenberger, "Prices of state-contingent claims implicit in option prices," J. Business, vol. 51, pp. 631-650 (1978) as the originator of a second-derivative method, although the latter paper nowhere mentions probabilities.

Approximating f(x) from finite bid and ask option prices

Equations (1a) and (1b) are obtained by assuming that the variable x is continuous and ranges from 0 to infinity. In practice, options are usually traded within certain price ranges and only for certain price intervals (e.g., ranging from $110 to $180 at $5 intervals). Thus, the call and/or put option prices are known only for a finite subset of strike prices. Under such circumstances, estimates of Equations (1a) and (1b) can be computed by taking differences instead of derivatives as follows.

We assume that the option prices c(x) and p(x) are quoted for a finite subset of equally-spaced strike prices x = nΔ, where n is an integer, and Δ is the spacing between quoted prices. Define c_n = c(nΔ), p_n = p(nΔ). Then the first derivatives c'(x) and p'(x) at x = (n + 1/2)Δ may be estimated by the first differences:

$$c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}; \tag{7a}$$

$$p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}. \tag{7b}$$

The corresponding estimates of the cumulative distribution function F_{n+1/2} = F((n + 1/2)Δ) are

$$F_{n+1/2} = c'_{n+1/2} + 1; \tag{8a}$$

$$F_{n+1/2} = p'_{n+1/2}. \tag{8b}$$

The second derivatives c''(x) and p''(x) at x = nΔ may likewise be estimated by the second differences, i.e., differences of the estimates of the first derivatives:

$$c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}; \tag{9a}$$

$$p''_n = \frac{p'_{n+1/2} - p'_{n-1/2}}{\Delta} = \frac{p_{n+1} - 2p_n + p_{n-1}}{\Delta^2}. \tag{9b}$$

Either of these estimates of the second derivatives may be used as an estimate of the probability density values at x = nΔ, i.e., f(nΔ):

$$f_n = p''_n \quad \text{or} \quad f_n = c''_n. \tag{10}$$
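The difference estimates of Equations (7)-(10) can be illustrated with a minimal Python sketch. The function name `implied_distribution` and the call quotes are hypothetical and chosen only for illustration (they are not the values of TABLE 1); the sketch uses the call-price branch and assumes equally spaced strikes.

```python
def implied_distribution(strikes, calls):
    """Estimate cdf and pdf values from call prices at equally spaced strikes.

    Returns (cdf, pdf): cdf is a list of (midpoint strike, F) pairs obtained
    from first differences, and pdf is a list of (strike, f) pairs obtained
    from second differences.
    """
    delta = strikes[1] - strikes[0]
    # First differences: slope of the call curve at each midpoint strike.
    slopes = [(calls[i + 1] - calls[i]) / delta for i in range(len(calls) - 1)]
    # For calls, the cdf estimate is the slope plus one.
    cdf = [(strikes[i] + delta / 2, slopes[i] + 1.0) for i in range(len(slopes))]
    # Second differences: change in slope at each interior strike.
    pdf = [(strikes[i + 1], (slopes[i + 1] - slopes[i]) / delta)
           for i in range(len(slopes) - 1)]
    return cdf, pdf


# Hypothetical call quotes at $5 strike intervals.
strikes = [100.0, 105.0, 110.0, 115.0, 120.0]
calls = [12.0, 8.0, 5.0, 3.0, 2.0]
cdf, pdf = implied_distribution(strikes, calls)
```

With these quotes the cdf estimates fall at the midpoint strikes 102.5, 107.5, 112.5, and 117.5, and the pdf estimates at the interior strikes 105, 110, and 115.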

Moreover, the market prices of call and put options are usually given in terms of a bid-ask spread, and thus either the bid price or the ask price (or some intermediate value) may be used as the call or put option price. By using the bid and ask prices for both the call option and the put option, four estimates of F(x) and f(x) may be obtained. These estimates may be combined according to their reliability in any desired way. For example, one might use the estimate derived from the put bid price curve for values of x less than the current price s of the underlying asset, and the estimate derived from the call bid price curve for values of x greater than s.
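The example combination rule (put-derived estimate below the current price, call-derived estimate above it) might be sketched as follows. The function name and sample values are hypothetical, and the treatment of a strike exactly equal to s (the call-derived value is used) is an assumption, since the text does not specify it.

```python
def combine_estimates(strikes, f_from_puts, f_from_calls, s):
    """Pick the put-derived density estimate below s, the call-derived one otherwise.

    Assumption: at a strike exactly equal to s the call-derived value is used.
    """
    return [fp if x < s else fc
            for x, fp, fc in zip(strikes, f_from_puts, f_from_calls)]


# Hypothetical density estimates at three strikes, with current price s = 112.
combined = combine_estimates([105.0, 110.0, 115.0],
                             [0.030, 0.041, 0.039],
                             [0.028, 0.040, 0.042],
                             s=112.0)
```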

Examples of c_n, p_n, F_{n+1/2}, and f_n are shown in figures 1, 2, and 3 using the data of TABLE 1 (see below).

Tabular data

TABLE 1 below shows sample bid prices of call and put options for strike prices of an asset ranging from $110 to $180 at $5 intervals and the cumulative distribution values F_{n+1/2} and probability density values f_n computed according to Equations (7)-(10) above.

In the table, the values for F_{n+1/2} correspond to strike prices that are midway between the two strike prices used to compute F_{n+1/2}. Thus, the cumulative distribution value shown to the right of the strike price $110 actually corresponds to the strike price $112.5, and the value to the right of the strike price $115 actually corresponds to strike price $117.5, and so forth.


Dynamic estimates for F(x) and f(x).

In Equations (7)-(10), the call and put option prices were assumed to be static in the calculation of the cumulative distribution function F(x) and probability density function f(x) for a finite subset of strike prices x = nΔ. In the real world, the price s of the underlying asset changes with time, and there will be a corresponding change in option prices. As a first order approximation, if the price s increases by a small amount δ, then the option price curves will effectively shift to the right by the amount δ. (Here, δ may be either positive or negative. For a more precise discussion of the shift, see the Appendix.) As a result, the price c(x) or p(x) now quoted at strike price x may be used as an estimate for the option price on the previous price curve at strike price x' = x - δ. As a result, the prices on the previous curve at a new discrete subset of strike prices x = nΔ - δ become effectively visible. Given enough movements of the underlying price, therefore, we can effectively compute estimates of c(x), p(x), F(x) and/or f(x) for a subset of strike prices x that is much more closely spaced than the subset available at any one time.
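A minimal sketch of this pooling step, under the first-order shift approximation only, is given below. The function name, the snapshot layout, and the quotes are hypothetical; the more precise shift treatment referred to in the Appendix is not modeled.

```python
def merge_shifted_quotes(snapshots):
    """Pool quotes from several snapshots onto the reference (first) price curve.

    Each snapshot is (shift, quotes): `shift` is the move delta of the
    underlying price since the reference snapshot, and `quotes` maps a strike
    x to the option price quoted at that strike. To first order, a price
    quoted at strike x after a shift delta estimates the reference curve at
    x' = x - delta. Later snapshots overwrite earlier ones if strikes collide.
    """
    merged = {}
    for shift, quotes in snapshots:
        for strike, price in quotes.items():
            merged[round(strike - shift, 6)] = price
    return sorted(merged.items())


# Hypothetical call quotes: a reference snapshot and one taken after the
# underlying price has risen by 2.
reference = (0.0, {110.0: 12.0, 115.0: 8.0, 120.0: 5.0})
after_move = (2.0, {110.0: 13.6, 115.0: 9.4, 120.0: 6.1})
fine_curve = merge_shifted_quotes([reference, after_move])
```

The merged strikes (108, 110, 113, 115, 118, 120) are more closely spaced than either snapshot alone, but they are generally not equally spaced, so difference formulas that allow unequal spacing would be needed when estimating F(x) and f(x) from them.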

Extrapolating and smoothing probability distributions.

In a typical options market, the option prices are available only for certain expiration dates. In addition, the option prices are more reliable for options that are actively traded, which are typically nearer-term options at strike prices near the underlying price. It is therefore desirable to extrapolate and interpolate probability distributions to times other than actual expiration dates and to wider ranges of strike prices.

Any standard extrapolation and smoothing techniques may be used directly on the cumulative distribution values F_{n+1/2} or probability density values f_n to give a smoothed and extrapolated estimate of F(x) or f(x). Similarly, given such estimated curves for a discrete subset of future times T, standard interpolation and extrapolation techniques may be used to estimate such curves for other specified values of T, or for a continuous range of T > 0.

A less direct but useful approach is to perform extrapolation and smoothing on an implied volatility function, which is then used to calculate the other functions, such as c(x), p(x), F(x), and f(x). The volatility rate of an asset (often simply called its volatility) is a measure of uncertainty about the returns provided by the asset. The volatility rates of a stock may typically be in the range of 0.3 to 0.5 per year.

An advantage of performing extrapolation and smoothing on implied volatility curves is that different types of volatility curves (so-called "volatility smiles") are known and can be used as a guide to the extrapolation and smoothing process to prevent "overfitting" of certain unreliable data points.

The standard method of computing implied volatilities is to invert the Black-Scholes pricing formula (see Appendix) for the actual call price c(x) or put price p(x) of an underlying asset at a given strike price x, given the underlying price s (current price of asset), risk-free rate of interest r, and T (expiration date). When this is done for a range of values of x, an estimate of an implied volatility curve σ(x) is obtained. This curve may be smoothed and extrapolated by any standard method to give a smoothed curve σ(x). Then corresponding smoothed put and call price curves may be computed using the Black-Scholes pricing formula and differentiated once or twice to give a smoothed cdf or pdf. Finally, given such estimated curves for a discrete subset of future times T, standard interpolation and extrapolation techniques may be used to estimate such curves for other specified values of T, or for a continuous range of T > 0.
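The inversion step can be sketched numerically as follows. This is an illustrative sketch rather than the procedure of the Appendix: it uses the textbook Black-Scholes formula for a European call on a non-dividend-paying asset, treats T as the time to expiration in years, and inverts by bisection. The function names, search bounds, and inputs are assumptions made for the example.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cdf


def bs_call(s, x, r, T, sigma):
    """Black-Scholes price of a European call (no dividends), T in years."""
    d1 = (math.log(s / x) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * N(d1) - x * math.exp(-r * T) * N(d2)


def implied_vol(price, s, x, r, T, lo=1e-4, hi=5.0, tol=1e-8):
    """Invert bs_call for sigma by bisection.

    Relies on the call price being an increasing function of sigma; the
    bounds lo and hi are assumed to bracket the answer.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, x, r, T, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip on hypothetical inputs: price a call at sigma = 0.4, then recover it.
quote = bs_call(s=100.0, x=105.0, r=0.05, T=0.5, sigma=0.4)
sigma = implied_vol(quote, s=100.0, x=105.0, r=0.05, T=0.5)
```

Repeating the inversion over a range of strikes x gives the points of an implied volatility curve, which can then be smoothed.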

Another new way to compute implied volatilities is first to compute a finite subset of cdf values F_{n+1/2} and then to invert the Black-Scholes cdf formula (see Appendix) at these values. When this is done for a range of values of x, an estimate of a generally different implied volatility curve σ_1(x) is obtained, called the cdf-implied volatility curve. Again, this curve may be smoothed and extrapolated by any standard method to give a smoothed curve σ_1(x). Then a corresponding smoothed cdf may be computed from the Black-Scholes cdf formula, and differentiated once to give a smoothed pdf. Finally, again, given such estimated curves for a discrete subset of future times T, standard interpolation and extrapolation techniques may be used to estimate such curves for other specified values of T, or for a continuous range of T > 0.

Some advantages of using the cdf-implied volatility curve rather than the conventional implied volatility curve are that the computations are simpler, at least from an estimate of F(x), and that it fits better with the multivariate techniques to be discussed below.

A particular method for finding a smoothed and extrapolated implied volatility curve σ_1(x, T) as a function of both strike price x and time T to expiration is as follows. The volatility curve is assumed to be approximated by a quadratic formula

$$\sigma_1(x, T) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 T + \alpha_4 T^2 + \alpha_5 x T. \tag{14}$$

The coefficients {α_i} are determined by regression to fit the available data regarding σ_1(x, T) as closely as possible. Given the smoothed curve σ_1(x, T), corresponding smoothed cdfs (for different x's and T's) may be computed from the Black-Scholes cdf formula for each time T, and differentiated once to give a smoothed pdf. An alternative procedure, with numerical advantages, is to use a quadratic fit like the above for a function σ(x, T), and then invert the Black-Scholes cdf to find σ_1(x, T). See the Appendix for the academic history of such approximations of σ(x, T). Another useful variation is to fit σ(x, T) with a quadratic function of x at times T which are specific expiration dates, then linearly interpolate at other times T.
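The regression for the six coefficients of Equation (14) might be sketched with ordinary least squares, solved here through the normal equations in plain Python. The function names and the synthetic data are hypothetical; expressing x as a ratio of strike to underlying price is an assumption made only to keep the numbers well scaled, and a production fit would likely use a numerical library and weight points by reliability.

```python
def fit_quadratic_surface(points):
    """Least-squares fit of sigma ~ a0 + a1*x + a2*x^2 + a3*T + a4*T^2 + a5*x*T.

    `points` is a list of (x, T, sigma) triples; returns [a0, ..., a5].
    """
    rows = [[1.0, x, x * x, T, T * T, x * T] for x, T, _ in points]
    targets = [sig for _, _, sig in points]
    n = 6
    # Normal equations (A^T A) a = A^T b, stored as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [u - factor * v for u, v in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def sigma_fit(a, x, T):
    """Evaluate the fitted quadratic surface at (x, T)."""
    return a[0] + a[1] * x + a[2] * x * x + a[3] * T + a[4] * T * T + a[5] * x * T


# Synthetic implied volatilities generated from known (hypothetical) coefficients.
true = [0.9, -1.2, 0.6, -0.05, 0.02, 0.01]
points = [(x, T, sigma_fit(true, x, T))
          for x in (0.8, 0.9, 1.0, 1.1, 1.2) for T in (0.25, 0.5, 1.0)]
a = fit_quadratic_surface(points)
```

Because the synthetic data lie exactly on a quadratic surface, the fit recovers the generating coefficients up to rounding; with market data the fit would only approximate the observed volatilities.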

Treatment of multiple assets

The techniques described so far give probability distributions for the future values of a single asset based on option price data for that asset.

However, in many cases an investor may be concerned with multiple assets, for example all of the stocks in his or her portfolio, or in a mutual fund, or in a certain index. Moreover, the investor may be concerned with the relations between one group of assets and another.

A general method for dealing with such questions is to generate multivariate probability distributions for all assets of interest. A multivariate cdf may be written as F(x_1, x_2, ..., x_n), where the variables (x_1, x_2, ..., x_n) are the values of the n assets of interest.

We will assume that we know from the techniques described above or otherwise the marginal cdfs F_i(x_i) for each of the individual variables. As a first step, we may define for each x_i a function y_i(x_i), called a "warping function," such that y_i(x_i) is a standard normal (Gaussian) variable with mean 0 and variance 1. This is simply done by defining y_i(x_i) such that F_i(x_i) = N(y_i(x_i)) for all values of x_i, where N(x) denotes the cdf of a standard normal variable. The function y_i(x_i) may be simply described in terms of σ_1(x_i). See the Appendix. Under mild technical conditions such as having a marginal cdf that varies monotonically, such a warping function y_i(x_i) has a well-defined inverse warping function x_i(y_i).
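A minimal sketch of a warping function and its inverse follows. It works directly from a marginal cdf rather than from the volatility-based description in the Appendix, and it uses a uniform marginal purely as a hypothetical example; the function names and the bisection bounds are assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, variance 1


def warp(x, marginal_cdf):
    """Warping function y(x): the standard normal quantile of F(x).

    NormalDist.inv_cdf raises an error if F(x) is exactly 0 or 1.
    """
    return _STD_NORMAL.inv_cdf(marginal_cdf(x))


def unwarp(y, marginal_cdf, lo, hi, tol=1e-9):
    """Inverse warping function x(y), found by bisection on [lo, hi].

    Assumes marginal_cdf is strictly increasing on [lo, hi].
    """
    target = _STD_NORMAL.cdf(y)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def uniform_cdf(x):
    """Hypothetical marginal cdf: price uniformly distributed on [80, 120]."""
    return min(max((x - 80.0) / 40.0, 0.0), 1.0)


y = warp(110.0, uniform_cdf)  # F(110) = 0.75, so y is the 75% normal quantile
x_back = unwarp(y, uniform_cdf, 80.0, 120.0)
```

Applying `unwarp` to the output of `warp` returns the original price up to the bisection tolerance, which is the round-trip property the inverse warping function is meant to have.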

Second, we assume that we can find the historical pairwise correlations between the warped standard normal variables y_i(x_i). These correlations may be computed by standard techniques from any available set of historical asset price data. We denote by C the n x n correlation matrix whose entries are these historically-based correlations. Because each of the variables y_i(x_i) is standard normal, the diagonal terms of C are all equal to 1.

Now let F_C(x_1, ..., x_n) denote the cdf of a multivariate Gaussian random n-tuple with zero mean and covariance matrix C. Define

$$F(x_1, x_2, \ldots, x_n) = F_C\bigl(y_1(x_1), y_2(x_2), \ldots, y_n(x_n)\bigr).$$

Then F(x_1, x_2, ..., x_n) is a multivariate cdf that (a) has the correct (given) marginal cdfs F_i(x_i); and (b) has the correct (historical) correlations between the warped standard normal variables y_i(x_i). We use this cdf to answer questions involving the variables (x_1, x_2, ..., x_n).

For example, the investor might have a portfolio consisting of a given quantity of each of these assets. The value of such a portfolio is the sum

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n, \tag{15}$$

where h_i represents the quantity of the i-th asset in the portfolio. The investor might be interested in an estimate of the probability distribution of the value x of the whole portfolio.

Such an estimate may be obtained by Monte Carlo simulation. For such a simulation, a large number N of samples from the multivariate Gaussian cdf F_C(y_1, ..., y_n) may be generated. Each sample (y_1, ..., y_n) may be converted to a sample (x_1, x_2, ..., x_n) by using the inverse warping functions x_i(y_i). The value x of the total portfolio may then be computed for each sample. From these N values of x, the probability distribution of x (e.g., its cdf F(x)) may be estimated.
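The simulation can be sketched as follows. A Cholesky factor of C is one common way to draw correlated standard normals; it is an assumption of this sketch, not a method prescribed here. The two-asset correlation matrix, the closed-form lognormal inverse warping functions, and the holdings are invented for illustration.

```python
import math
import random

def cholesky(c):
    """Lower-triangular L with L L^T = c, for a positive-definite matrix c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def sample_prices(c, inverse_warps, n_samples, seed=0):
    """Draw correlated standard normals (y_1..y_n) and map each through x_i(y_i)."""
    rng = random.Random(seed)
    low = cholesky(c)
    n = len(c)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        samples.append([inverse_warps[i](y[i]) for i in range(n)])
    return samples

def portfolio_values(samples, holdings):
    """Portfolio value x = h_1 x_1 + ... + h_n x_n for every stored sample."""
    return [sum(h * x for h, x in zip(holdings, s)) for s in samples]

def empirical_cdf(values, x):
    """Fraction of simulated values that do not exceed x."""
    return sum(1 for v in values if v <= x) / len(values)

# Hypothetical two-asset example: lognormal marginals, correlation 0.6.
C = [[1.0, 0.6], [0.6, 1.0]]
INVERSE_WARPS = [
    lambda y: 50.0 * math.exp(0.30 * y),
    lambda y: 20.0 * math.exp(0.20 * y),
]
SAMPLES = sample_prices(C, INVERSE_WARPS, n_samples=20000)
VALUES = portfolio_values(SAMPLES, holdings=[10, 25])
```

Because SAMPLES is retained, calling portfolio_values(SAMPLES, holdings=[5, 40]) revalues an alternative portfolio without drawing new samples.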

In practice, it is useful to save the N multivariate samples in a large database. Then the cdf of any quantity whose value is a function of the variables (x_1, x_2, ..., x_n) may be estimated from this database. For example, if the investor would like to know the cdf of some alternative portfolio with different quantities of each asset, this can be quickly determined from the stored database.

An investor may also determine the effect of one portfolio (or event(s) or variables such as interest rates, P/E ratios, public interest in a certain sector of the market) on another portfolio as follows. Assume that the first portfolio is represented by x, where

x = h_1 x_1 + h_2 x_2 + ... + h_n x_n, (30)

where each x_i may be viewed as the price of a portfolio component, and the second portfolio is represented by y, where

y = g_1 x_1 + g_2 x_2 + ... + g_n x_n, (31)

where each x_i may be viewed as the price of a portfolio component or, more broadly, as any economic variable (macroeconomic, fundamental, or sector related).

Consider the "what-if" question: letting A and B be given positive constants, if x ≥ A at time T, what is the probability that y > B at time T? This question can be answered by creating a Monte Carlo database as above for the multivariate cdf F(x_1, x_2, ..., x_n) corresponding to time T, identifying those samples for which x ≥ A, and then using only these samples to estimate the probability that y > B. More generally, any conditional cdf of the form F(x | E) can be estimated similarly, where x is any function of the variables (x_1, x_2, ..., x_n) and E is any event defined in terms of the variables (x_1, x_2, ..., x_n).
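The conditional estimate amounts to filtering the stored samples and counting. In the sketch below, demo_samples is a hypothetical stand-in for the stored database (two correlated lognormal prices), and the holdings and thresholds are invented for illustration.

```python
import math
import random

def conditional_probability(samples, condition, event):
    """Estimate P(event | condition) using only the samples that satisfy condition."""
    kept = [s for s in samples if condition(s)]
    if not kept:
        raise ValueError("no samples satisfy the condition")
    return sum(1 for s in kept if event(s)) / len(kept)

def demo_samples(n_samples=20000, rho=0.6, seed=1):
    """Hypothetical stand-in for the stored database: two correlated lognormal prices."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = z1
        y2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        out.append((50.0 * math.exp(0.3 * y1), 20.0 * math.exp(0.2 * y2)))
    return out

H = (10, 0)   # holdings of the first portfolio (illustrative)
G = (0, 25)   # holdings of the second portfolio (illustrative)
A, B = 550.0, 520.0

def x_value(s):
    return H[0] * s[0] + H[1] * s[1]

def y_value(s):
    return G[0] * s[0] + G[1] * s[1]

SAMPLES = demo_samples()
P_COND = conditional_probability(
    SAMPLES, lambda s: x_value(s) >= A, lambda s: y_value(s) > B
)
P_UNCOND = sum(1 for s in SAMPLES if y_value(s) > B) / len(SAMPLES)
```

With positively correlated assets, the conditional estimate P_COND comes out above the unconditional estimate P_UNCOND, which is the kind of dependence the "what-if" question is meant to expose.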

Similarly, suppose an investor would like to know whether it is reasonable to believe that a certain stock or portfolio x will have a value greater than a given constant A at time T. This kind of question can be addressed by estimating the conditional cdf of some other related and perhaps better-understood variable (or combination of variables) y at time T, given that x ≥ A. If the resulting distribution for y does not look reasonable, then the investor may conclude that it is unreasonable to expect that x ≥ A.

Applications that use the probability distribution information

A wide variety of techniques may be used to accumulate and process the information needed for the calculations described above and to provide the information to users directly or indirectly through third parties. Some of these techniques are described below.

As shown in figure 4, the probability distribution information can be provided to users from a host server 102 connected to a communication network 104, for example, a public network such as the Internet or a private network such as a corporate intranet or local area network (LAN). For purposes of illustration, the following discussion assumes that network 104 is the Internet.

The host server 102 includes a software suite 116, a financial database 120, and a communications module 122. The communications module 122 transmits and receives data generated by the host server 102 according to the communication protocols of the network 104.

Also connected to the network are one or more of each of the following (only one is shown in each case): an individual or institutional user 108, an advertisement provider 110, a financial institution 112, a third party web server 114, a media operator 124, and a financial information provider 106.

The operator of the host server could be, for example, a financial information source, a private company, a vendor of investment services, or a consortium of companies that provides a centralized database of information.

The host server 102 runs typical operating system and web server programs that are part of the software suite 116. The web server programs allow the host server 102 to operate as a web server and generate web pages or elements of web pages, e.g., in HTML or XML code, that allow each user 108 to receive and interact with probability distribution information generated by the host server.

Software suite 116 also includes analytical software 118 that is configured to analyze data stored in the financial database 120 to generate, for example, the implied probability distribution of future prices of assets and portfolios.

The financial database 120 stores financial information collected from the financial information providers 106 and computation results generated by the analytical software 118. The financial information providers 106 are connected to the network 104 via a communication link 126, or the financial information providers may feed the information directly to the host server through a dial-up or dedicated line (not shown).

Figure 4 gives a functional view of an implementation of the invention. Structurally, the host server could be implemented as one or more web servers coupled to the network, one or more applications servers running the analytical software and other applications required for the system and one or more database servers that would store the financial database and other information required for the system.

Figure 10 shows an example of a data feed 150 sent from the financial information provider 106 to the host server 102 through the communication link 126. Information is communicated to the host server in the form of messages 151, 152. Each message contains a stream of one or more records 153, each of which carries information about option prices for an underlying asset. Each message includes header information 154 that identifies the sender and receiver, the current date 155, and an end of message indicator 158, which follows the records contained in the message.

Each record 153 in the stream includes an identifier 156 (e.g., the trading symbol) of an underlying asset, an indication 158 of whether the record pertains to a put or call, the strike date 160 of the put or call, the strike price 162 of the put or call, current bid-ask prices 164 of the underlying asset, bid-ask prices 166 for the option, and transaction volumes 168 associated with the option. The financial information provider 106 may be an information broker, such as Reuters, Bridge, or Bloomberg, or any other party that has access to or can generate the information carried in the messages. The broker may provide information from sources that include, for example, the New York Stock Exchange and the Chicago Board Options Exchange.
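One possible in-memory representation of such messages and records is sketched below. No wire format is specified here, so the class names, field names, and types are hypothetical; only the list of fields follows the description above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple

@dataclass
class OptionRecord:
    symbol: str                              # identifier of the underlying asset
    is_call: bool                            # True for a call, False for a put
    strike_date: date                        # strike date of the option
    strike_price: float                      # strike price of the option
    underlying_bid_ask: Tuple[float, float]  # bid-ask prices of the underlying asset
    option_bid_ask: Tuple[float, float]      # bid-ask prices for the option
    volume: int                              # transaction volume for the option

    def option_mid(self) -> float:
        """Midpoint of the option's bid and ask (a convenience, not part of the feed)."""
        bid, ask = self.option_bid_ask
        return 0.5 * (bid + ask)

@dataclass
class FeedMessage:
    sender: str
    receiver: str
    current_date: date
    records: List[OptionRecord] = field(default_factory=list)

    def records_for(self, symbol: str) -> List[OptionRecord]:
        """All records in this message that pertain to one underlying asset."""
        return [r for r in self.records if r.symbol == symbol]
```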

The financial database 120 stores the information received in the information feed from the financial information providers and other information, including, for example, interest rates and volatilities. The financial database also stores the results generated by the analytical software, including probability distribution functions with respect to the underlying assets and assets that are not the subject of options.

The probability distribution information is generated continually (and essentially in real time) from the incoming options data so that the information provided and displayed to users is current. That is, the information is not based on old historical data but rather on current information about option prices.

In addition, other soft information can be accumulated, stored, and provided to users, including fundamental characteristics of the underlying assets, such as prices, volatility values, beta, the identification of the industry to which the asset belongs, the yield, the price to book ratio, and the leverage. Other information could include calendars of earnings forecast dates, earnings forecasts, corporate action items, news items that relate to an industry, and the volume of institutional holdings.

The messages from the information provider 106 may be sent in response to requests by the host server 102, the information may be sent to the host server 102 automatically at a specified time interval, or the information may be sent as received by the information provider from its sources. The financial database 120 may be maintained on a separate server computer (not shown) that is dedicated to the collection and organization of financial data. The financial database is organized to provide logical relationships among the stored data and to make retrieval of needed information rapid and effective.

The user 108 may use, for example, a personal computer, a TV set top box, a personal digital assistant (PDA), or a portable phone to communicate with the network 104. Any of these devices may be running an Internet browser to display the graphical user interface (GUI) generated by the host server 102.

The host server 102 may provide probability distribution information on the network 104 in the form of web pages and allow the individual user 108, the financial institution 112, the third party web server 114, and the media operator 124 to view the information freely. The host company that runs the host server 102 may generate revenue by, for example, selling advertisement space on its web pages to an advertisement provider 110.

The host server 102 may also provide proprietary information and enhanced services to individual users 108, financial institutions 112, third party web servers 114, and media operators 124 for a subscription fee.

The host server 102 may have a direct link to the financial institutions 112 to provide tailored information in a format that can be readily incorporated into the databases of the financial institutions 112. Financial institutions 112 may include, for example, investment banks, stock brokerage firms, mutual fund providers, bank trust departments, investment advisers, and venture capital investment firms. These institutions may incorporate the probability distribution information generated by the analytical software 118 into the financial services that they provide to their own subscribers. The probability distribution information provided by the host server 102 enables the stock brokerage firms to provide better advice to their customers.

A third party web server 114 may incorporate probability distribution information into its web site. The information may be delivered in the form of an information feed to the third party host of web server 114 either through the Internet or through a dedicated or dial-up connection.

Figure 10 shows an example of a data feed 182 sent from the host server 102 to the third party web server 114 through communication link 128.

Data feed 182 carries messages 184 that include header information 186, identifying the sender and receiver, and records 188 that relate to specific underlying assets.

Each record 188 includes an item 190 that identifies a future date, a symbol 192 identifying the asset, risk-neutral probability density information 193 and cumulative distribution information 194. The record could also include a symbol identifying a second asset 195 with respect to the identified future date, and so on. Other information could be provided such as a risk premium value with respect to the risk-neutral values.

Examples of third party web servers 114 are the web servers of E*TRADE, CBS MarketWatch, Fidelity Investments, and The Wall Street Journal. The third party web server 114 specifies a list of assets for which it needs probability distribution information. Host server 102 periodically gathers information from financial information provider 106 and its own financial database 120, generates the probability distribution information for the specified list of assets, and transmits the information to the third party web server 114 for incorporation into its web pages.

Examples of the media operator 124 are cable TV operators and newspaper agencies that provide financial information. For example, a cable TV channel that provides stock price quotes may also provide probability distribution information generated by the host server 102. A cable TV operator may have a database that stores the probability distributions of all the stocks that are listed on the NYSE for a number of months into the future. The host server 102 may periodically send updated information to the database of the cable TV operator. When a subscriber of the cable TV channel views the stock price quotes on a TV, the subscriber may send commands to a server computer of the cable TV operator via modem to specify a particular stock and a particular future date. In response, the server computer of the cable TV operator retrieves the probability distribution information from its database and sends the information to the subscriber via the cable network, e.g., by encoding the probability distribution information in the vertical blanking interval of the TV signal.

Likewise, a newspaper agency that provides daily transaction price quotes may also provide the probabilities of stock prices rising above certain percentages of the current asset prices at a predetermined future date, e.g., 6 months. A sample listing in a newspaper may be "AMD 83 88 85 A 40%", meaning that the AMD stock has a lowest price of $83, a highest price of $88, a closing price of $85 that is higher than the previous closing price, and a 40% probability of rising 10% in 6 months.

The analytical software 118 may be written in any computer language such as Java, C, C++, or FORTRAN. The software may include the following modules: (1) input module for preprocessing data received from the financial data sources; (2) computation module for performing the mathematical analyses; (3) user interface module for generating a graphical interface to receive inputs from the user and to display charts and graphs of the computation results; and (4) communications interface module for handling the communications protocols required for accessing the networks.

Web pages and user interfaces

A variety of web pages and user interfaces can be used to convey the information generated by the techniques described above.

For example, referring to figure 5, a GUI 700 enables a user 108 to obtain a range of financial services provided by the host server 102. The user 108 may see the implied probabilities of future prices of marketable assets 706 having symbols 704 and current prices 708. The information displayed could include the probabilities 714 (or 718) of the asset prices rising above a certain specified percentage 712 (or falling below a certain specified percentage 716) of the reference price 710 within a specified period of time 720.

For the convenience of the user 108, GUI 700 includes links 730 to institutions that facilitate trading of the assets. The host company that runs the host server 102 sells advertising space 728 on the GUI 700 to obtain revenue. The GUI 700 also has links 726 to other services provided by the host server 102, including providing advice on lifetime financial management, on-line courses on topics related to trading of marketable assets, research on market conditions related to marketable assets, and management of portfolios of assets.

Referring to figure 6, the GUI 700 also may display an interactive web page to allow the user 108 to view the market's current prediction of future values of portfolios of assets. The past market price 734 and current market price 736 of the asset portfolios 732 are displayed. Also displayed is the price difference 738. The GUI 700 displays the probability 744 (or 746) that the portfolio 732 will gain (or lose) a certain percentage 740 within a specified time period 742. Examples of portfolios include stock portfolios, retirement 401(k) plans, and individual retirement accounts. Links 748 are provided to allow the user 108 to view the market's current forecast of future price trends of the individual assets within each portfolio.

Referring to figure 7, in another user interface, the GUI 700 displays an interactive web page that includes detailed analyses of past price history and the market's current forecast of the probability distribution of the future values of a marketable asset over a specified period of time. The GUI 700 includes price-spread displays 750 representing the cumulative distribution values of the predicted future prices of an asset over periods of time. The price-spread display 750a shows the price distribution data that was generated at a time three months earlier. A three-month history of the actual asset prices is shown as a line graph for comparison to give the user 108 a measure of the merit of the price distribution information. The price-spread display 750b represents the predicted cumulative distribution values of the asset prices over a period of one month into the future. The left edge of display 750b, of course, begins at the actual price of the asset as of the end of the prior three-month period, e.g., the current DELL stock price of $50. The probability distribution information implies, for example, a 1% probability that the stock price will fall below $35, and a 99% probability that the stock price will fall below $80 in one month. GUI 700 includes table 752 that shows highlights of asset information and graph 754 that shows sector risks of the asset. A box 755 permits a user to enter a target price and table 757 presents the probability of that price at four different future times, based on the calculated implied probability distributions.

Referring to figure 8, in another approach, a window 402 is displayed on a user's screen showing financial information along with two other windows 408 and 410 showing probability distribution information. The individual user 108 could have previously downloaded a client program from the host server 102. When the user is viewing any document, e.g., any web page (whether of the host server 102 or of another host's server), the user may highlight a stock symbol 404 using a pointer 406 and type a predetermined keystroke (e.g., "ALT-SHIFT-Q") to invoke the client program. The client program then sends the stock symbol as highlighted by the user to the host server 102. The host server 102 sends probability distribution information back to the client program, which in turn displays the information in separate windows 408 and 410.

When the client program is invoked, a window 422 may be displayed showing the different types of price information that can be displayed. In the example shown, the "Probability distribution curve" and "Upper/lower estimate curves" are selected. Window 408 shows the price range of AMD stock above and below a strike price of $140 from July to December, with 90% probability that the stock price will fall between the upper and lower estimate curves. Window 410 shows the probability density curve f(x) for AMD stock for a future date of 8/15/2000. The user may also specify a default function curve, such that whenever an asset name is highlighted, the default function curve is displayed without any further instruction from the user.

Tabular data such as those shown in TABLE 1 may be generated by the host server 102 and transmitted over the network 104 to devices that have limited capability for displaying graphical data. As an example, the individual user 108 may wish to access asset probability distribution information using a portable phone. The user enters commands using the phone keypad to specify a stock, a price, and a future date. In response, the host server 102 returns the probability of the stock reaching the specified price at the specified future date in tabular format suitable for display on the portable phone screen.

Referring to figure 9, a portable phone 500 includes a display screen 502, numeric keys 506, and scrolling keys 504. A user may enter commands using the numeric keys 506. Price information received from the host server 102 is displayed on the display screen 502. Tabular data typically includes a long list of numbers, and the user may use the scroll keys 504 to view different portions of the tabular data.

In the example shown in display screen 502, the AMD stock has a current price of $82. The cumulative distribution values F(x) for various future prices on 8/15/2000 are listed. The distribution indicates a 40% probability that the stock price will be below $80, implying a 60% probability of the stock price being above $80. Likewise, the distribution indicates an 80% probability that the stock price will be below $90, implying a probability of 20% of the stock price being above $90.
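The arithmetic behind such a listing is the complement rule, P(price > p) = 1 - F(p). The sketch below applies it to a tabulated cdf, with linear interpolation between listed prices as an assumption of the sketch. Only the $80 and $90 entries echo the example; the other table values and the function names are invented for illustration.

```python
from bisect import bisect_right

def cdf_from_table(table, price):
    """Linearly interpolate a tabulated cdf given as sorted (price, F(price)) pairs."""
    prices = [p for p, _ in table]
    if price <= prices[0]:
        return table[0][1]
    if price >= prices[-1]:
        return table[-1][1]
    k = bisect_right(prices, price)
    (p0, f0), (p1, f1) = table[k - 1], table[k]
    return f0 + (f1 - f0) * (price - p0) / (p1 - p0)

def prob_above(table, price):
    """Probability that the price ends above the given level: 1 - F(price)."""
    return 1.0 - cdf_from_table(table, price)

# Illustrative table; only the $80 and $90 rows come from the example.
TABLE = [(60.0, 0.05), (70.0, 0.20), (80.0, 0.40), (90.0, 0.80), (100.0, 0.95)]
```

For instance, prob_above(TABLE, 80.0) gives the 60% figure and prob_above(TABLE, 90.0) gives the 20% figure, up to floating-point rounding.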

The visualization techniques discussed below are useful in enabling users to visualize and quickly understand information that relates to assets.

Visualization of implied probability distributions of future prices

As shown in figure 11, a visualization device 10 displays cumulative probability distribution values of predicted relative future prices of Dell Computer Corporation stock with respect to a current date 12 of July 1, 2000. The price 14 on July 1, 2000, is shown as being $41 lower than the price 16 on February 1, 2000, which itself is set at an arbitrary starting value of $0 for purposes of display. The display could be provided in actual price terms, as a price change, or in terms of percentage return. The probability distribution data on which the visualization device 10 is based may be generated by, for example, the method discussed in the parent patent application.

The predicted cumulative distribution values of the prices of Dell stock over a period of several months into the future are illustrated by an envelope 16 that begins at a point 18 and opens to the right.

The envelope 16 is divided into stripes 22, 24, 26, 28, 30, each of which also begins at point 18 and opens to the right. Stripe 22, for example, indicates a range of prices (all of which are below the current price) at each date in the future and indicates the predicted odds (10%) that the price will fall within that stripe. Similarly, stripe 26 indicates a range of prices (above and below the current price) with an expected 40% odds of occurring on various dates in the future. The odds of falling either above or below envelope 16 are, as indicated, less than 1%. Each stripe is displayed in a different color, and the colors are chosen to permit a viewer to visualize the different stripes easily.
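Stripe boundaries of this kind are quantiles of the predicted price distribution at each future date. The sketch below is illustrative only: the cumulative levels are assumed (the description gives the odds for two stripes but not the full set of levels), and the driftless lognormal quantile function with square-root-of-time volatility is a stand-in for whatever implied distribution is actually available.

```python
import math
from statistics import NormalDist

# Assumed cumulative levels bounding five stripes (9%, 20%, 40%, 20%, 9%).
LEVELS = [0.01, 0.10, 0.30, 0.70, 0.90, 0.99]

def quantile(price0, annual_vol, years, level):
    """Hypothetical quantile function: driftless lognormal, sqrt-time volatility."""
    if years == 0:
        return price0
    z = NormalDist().inv_cdf(level)
    return price0 * math.exp(annual_vol * math.sqrt(years) * z)

def envelope(price0, annual_vol, horizons):
    """For each horizon (in years), the stripe boundaries from lowest to highest."""
    return [[quantile(price0, annual_vol, t, q) for q in LEVELS] for t in horizons]
```

At a horizon of zero all boundaries coincide at the current price, which reproduces the envelope beginning at a single point and opening to the right.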

A similar envelope 32 starts at the nominal $0 price on February 1, 2000, and ends on the current date. Envelope 32 represents the cumulative distribution values of the prices of Dell stock that were predicted as of February 1, 2000. The actual price history of Dell stock between February 1, 2000, and the current date is illustrated by the line 34. The extent to which the actual price history of line 34 matches the predicted cumulative distribution values gives a visual indicator to the user of the validity of the prediction approach.

The combination of color, text, and data illustrated in figure 11 enables an investor to assess the performance of an asset over time relative to his price expectations.

The visualization device of figure 11 is also useful for assets other than stocks, including mutual funds, and for portfolios of assets.

Figure 12 presents information similar to figure 11, but is expressed with respect to projected return percentage rather than price. The example shown in figure 12 relates to Checkpoint Software Technologies Ltd stock as of a current date 66 of October 24, 2000. The x-axis represents return percentage with respect to a start date. Line 62 shows the historical return with respect to the stock price on the start date of January 1, 2000 at point 67. On the current date 66, the cumulative return on the price of the stock since start point 67 is approximately 200%.

An envelope 68 starts at point 66 and opens to the right. The envelope 68 illustrates the projected odds of the percentage return being within certain ranges on each day for several months into the future relative to the original start point 67. The ranges are expressed as stripes 52, 54, 56, 58, and 60. The envelope and stripes are centered on a trend line 50 that has a slightly positive slope to reflect the probability of future price levels generated by a mathematical algorithm that is based on the implied volatility of the options market. The algorithm is described in the related pending United States patent application 09/641,589, filed 08/18/2000.

For example, the projected odds that the return (relative to the start point 67) will be between 50 and 100% on May 1, 2000, is 10%.

The same kind of data used to generate the display of figure 11 is used to generate the device of figure 12 except that the data is processed to convert the price data into change of price data for plotting along the x-axis.

Figure 13 is similar to figure 12, except that it shows the effect of the long-term capital gain tax rate transition (identified as the vertical line 80 that is one year after the start date 82). After the date represented by line 80, any sale of the stock would produce a lower tax impact, and a higher effective return, than under the short-term capital gain tax rate that applies prior to that date. For that reason, the envelope 84 is shifted upward and exploded for periods after the transition date.

Visualization of asset style

Figure 14 shows another visualization device that reflects an asset fund style analysis that evaluates an asset fund (e.g., a mutual fund) by comparing its historical returns to those achieved by a set of basic asset classes (e.g., cash, bonds, large-cap growth stocks, large-cap value stocks).

The first step of the style analysis is a one-time selection of basic asset classes, which should be mutually exclusive and exhaustive, to represent all asset types of interest. In one example of classes (listed below) there are seventeen market indices, seven of which represent stocks and the remainder of which represent bonds.

The second step of style analysis determines the exposure of a given mutual fund to these indices. This is achieved by solving an asset class factor model, in which a fund return is expressed as a linear combination of returns from basic asset classes plus a residual. The exposures are determined by minimizing the variance of residuals using one-year weekly data. It is believed that one-year weekly data can reflect a fund style more accurately. In addition, fund exposures to basic asset classes are constrained to be non-negative and to sum to one.
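The constrained minimization can be sketched with projected gradient descent, which is one of several possible solvers and is not named in this description; a quadratic programming routine would serve equally well. The synthetic returns below are invented so that the known exposures (70% to the first class, 30% to the third) can be recovered, and all names are hypothetical.

```python
import random

def project_to_simplex(v):
    """Euclidean projection onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def style_weights(fund, classes, iters=2000):
    """Exposures minimizing residual variance; classes[i] is class i's return series."""
    n, t = len(classes), len(fund)
    fund_mean = sum(fund) / t
    r = [x - fund_mean for x in fund]            # centering turns variance into mean square
    cols = []
    for series in classes:
        m = sum(series) / t
        cols.append([x - m for x in series])
    trace = sum(sum(x * x for x in col) for col in cols) / t
    step = 1.0 / (2.0 * trace)                   # safe step: 2*trace bounds the curvature
    w = [1.0 / n] * n
    for _ in range(iters):
        resid = [r[k] - sum(w[i] * cols[i][k] for i in range(n)) for k in range(t)]
        grad = [-2.0 / t * sum(cols[i][k] * resid[k] for k in range(t)) for i in range(n)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w

# Synthetic one-year weekly data (52 observations, three asset classes).
_rng = random.Random(0)
CLASSES = [[_rng.gauss(0.002, 0.02) for _ in range(52)] for _ in range(3)]
FUND = [0.7 * a + 0.3 * c for a, _b, c in zip(*CLASSES)]
WEIGHTS = style_weights(FUND, CLASSES)
```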

The third step of style analysis is to present the results in a form that provides meaningful investment information. Style analysis results for a given fund consist of percentages in each basic asset class, with the dominant percentages determining the fund's style. Style drift for a given fund is based on determining style changes over the most recent five years.

In figure 14, the results of the analysis are displayed. The colors of the respective cells 102 indicate how much of the fund's performance is explained by regression to the style associated with the row in which the cell appears, during the period represented by the column in which the cell appears.

The example shown in figure 14 identifies each of seventeen indices (styles 100) that are of interest to a broad group of individual investors. For example, the style LG refers to a set of stocks that are characterized as Large Capitalization Growth. The full list of groups in this example follows:

1. Large-Cap Growth (LG)

2. Large-Cap Value (LV)

3. Mid-Cap (MC)

4. Small-Cap (SC)

5. European Stocks (EU)

6. Japanese Stocks (JP)

7. Emerging Markets (EM)

8. Cash (TB)

9. Intermediate Government Bonds (GI)

10. Long-term Government Bonds (GL)

11. Intermediate Corporate Bonds (CI)

12. Long-term Corporate Bonds (CL)

13. Corp Junk Bonds (HY)

14. Mortgage-Backed Securities (MG)

15. Real Estate (RE)

16. Municipal Bonds (MU)

17. Global Bonds (GG)

Thus, for cell 102, the regression indicates that about 45% of the fund's performance is correlated to the LG style for that period in 2000.

The values determined by the regression are displayed in a grid with style on the vertical axis and time on the horizontal axis. The color of each cell 102 indicates the percentage in accordance with the percentage scale shown on the right.

The resulting visualization device enables an investor to assess the performance of the asset over time relative to his investment preferences and strategy.

Visualization of recent market activity

The ability to track the activity of a market of assets (such as stocks or mutual funds) as the activity unfolds is of great interest to investors. Many investors rely on daily publications of tabular data that present information such as volume, price change, asset identification, and performance.

The visualization device shown in figure 15 collects, condenses, and enhances such information in a way that improves the ability of an investor to visually and quickly grasp recent and current market activity.

The displays are updated continually and quickly throughout a trading day.

As shown in figure 15, the visualization device 120 includes a radar-like display that is divided into sectors 122 arranged around a central point 124. The device is also divided into rings 126 that are centered on point 124 and filled with different colors to distinguish the different rings visually.

Each of the sectors 122 is associated with an industry or sector of interest to investors, for example the technology sector or the financial sector. The size of each sector depends on the proportion of the asset items being displayed for the sector relative to the total number of asset items being depicted for the whole universe.

Each of the rings represents a different percentage of price change during a recent period (e.g., during a single trading day). The rings are arranged with the largest percentage decline near the middle of the radar and the largest percentage increase near the periphery.

Within each of the sectors, small dots 128 are displayed, each representing a selected stock or asset within the industry sector represented by the radar sector. The distance of each dot from the central point 124 represents the percentage price change of the corresponding stock at a given time during a trading day. Gray dots represent small capitalization stocks; black dots represent large capitalization stocks.

When multiple stocks in a sector have the same percentage change (e.g., at location 130), the dots are displayed at different angular positions relative to the central point, to convey to the viewer an impression of the distribution of the percentage changes within each sector.
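The geometry of the radar display can be sketched as a mapping from (sector, percentage change) to polar coordinates. The clipping range of plus or minus 10%, the even angular spacing of dots within a sector, and all names below are assumptions made for illustration.

```python
import math

def sector_angles(counts):
    """Angular span (start, end) in radians per sector, proportional to item count."""
    total = sum(counts.values())
    spans, start = {}, 0.0
    for name, count in counts.items():
        width = 2.0 * math.pi * count / total
        spans[name] = (start, start + width)
        start += width
    return spans

def radius(pct_change, max_abs_pct=10.0, max_radius=1.0):
    """Largest decline maps to the centre, largest increase to the periphery."""
    clipped = max(-max_abs_pct, min(max_abs_pct, pct_change))
    return max_radius * (clipped + max_abs_pct) / (2.0 * max_abs_pct)

def place_dots(stocks, max_abs_pct=10.0):
    """stocks: list of (symbol, sector, pct_change). Returns {symbol: (x, y)}."""
    counts = {}
    for _, sector, _ in stocks:
        counts[sector] = counts.get(sector, 0) + 1
    spans = sector_angles(counts)
    seen, positions = {}, {}
    for symbol, sector, pct in stocks:
        lo, hi = spans[sector]
        k = seen.get(sector, 0)
        seen[sector] = k + 1
        # Spread dots across the sector so equal changes land at different angles.
        theta = lo + (hi - lo) * (k + 0.5) / counts[sector]
        r = radius(pct, max_abs_pct)
        positions[symbol] = (r * math.cos(theta), r * math.sin(theta))
    return positions
```

Two stocks in the same sector with the same percentage change receive the same radius but different angles, matching the behavior described for location 130.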

Implementation details

The visualization elements described above can be displayed on a wide range of devices, including desktop and laptop computers, personal digital assistants, portable telephones, publicly viewed large-screen displays, or closed circuit or broadcast/cable television monitors.

The visualization elements can be displayed alone or embedded in other displayed material, including other financial information, general news information, or program material. For example, the elements can be displayed as part of a website page dedicated to financial information or as part of a general web portal page. The elements can be displayed as part of a broadcast or cable TV program.

The raw data from which the visualization elements are created may be obtained on-the-fly electronically and/or may be stored as needed either locally or centrally. Software that processes the raw data to generate the derived values to be represented in the visualization elements may run locally or may be run remotely (and then downloaded to a local display). Software that processes the derived values to produce the visualization elements may be handled similarly.

The raw data, the derived values, and the visualization elements can be updated more or less frequently, though in many cases real-time updates are especially useful.

Each of the visualization elements could be made interactive by enabling a user to provide inputs, for example, mouse clicks, that indicate how the user wishes to alter the manner in which the elements are displayed, or the selection of data contained in them. Configuration features can be provided to enable the user to configure what information he receives, in what form it is displayed, and how often and how currently he receives it.

Other implementations are within the scope of the following claims.

For example, with respect to the visualization element shown in figure 15, the overall shape of the element could be other than round, the sectors could be other than simple pie shapes, the rings could be other than simple rings, the individual dots could be replaced by other icons, the dots or other icons could be arrayed in other arrangements from the center, and visible features other than color could be used to distinguish different portions of the display.

A wide range of variants is also possible with respect to the visualization elements shown in the figures.

Appendix A

Three aspects of the invention are:

1. The recognition of the desirability of displaying to a financial investment customer in real time, for example on a World Wide Web site, the probability distribution governing the price of a particular asset (e.g., a stock) at a selected future time.

2. The recognition that such probability distributions can be derived from option prices for that asset, or for related assets, which are readily available in real time.

3. The recognition that probability distributions involving several asset prices simultaneously are useful to investment customers in several contexts, especially in exploring hypothetical scenarios, and that single asset distributions such as (but not restricted to) the above can be meaningfully incorporated into multivariate distributions, manageably determined.

In this appendix we first describe a basic method for deriving probability distributions for single assets from option prices. We next describe improvements on this basic method to address various practical issues. Then we take up the multivariate case and show how to extend this kind of single asset price distribution, or any other, to the multivariate case. Finally, we consider a number of novel multivariate applications, with emphasis on scenario exploration.

1 Basic method

A call option is an option to buy an asset (e.g., a stock) at a certain price x (called the strike price) on a given expiration date T days in the future. (An option exercisable only on the expiration date is called a European-style option; for simplicity we will consider in this discussion only this type of option.¹) Similarly, a put option is an option to sell an asset at a strike price x on a given expiration date. (The "European-style" assumption of no possible early exercise is more important here, but can also be ignored for puts that are not too deeply "in the money.")

Let c(x) denote the price of a call option on an asset at strike price x, and p(x) the price of a put option. Such prices are established by options market-makers. We have realized that such prices implicitly contain information about a "market view" of the probability distribution of the price of that asset at the expiration date.

In a simple but precise form, this market view can be stated as follows. Suppose that we were given the call price curve c(x) or the put price curve p(x) as a continuous function of the strike price x for all x > 0. Then, the second derivative of either the call or the put price curve is the market view of the risk-neutral probability density function (pdf) f(x) of the asset price at the expiration date. In other words, f(x) = c"(x) = p"(x).

The idea that option prices determine some kind of implied probability distribution is fairly well known in the financial literature. The idea that a pdf can be computed by taking the second derivative of a continuous option price curve is known in the academic literature, but it does not appear to be very well known. For example, the standard textbook "Options, Futures, and Other Derivatives" by John C. Hull (Fourth Edition, 1999; Prentice-Hall) mentions implied probabilities, but not the second-derivative method. The best reference that we have been able to find is J. C. Jackwerth and M. Rubinstein, "Recovering probability distributions from option prices," J. Finance, vol. 51, pp. 1611-1631 (1996), which has only six prior references.

¹ Even allowing for possible early exercise, most liquidly traded call options without large dividends can be treated as if there were no possibility of such exercise, since sale of the option is usually a better alternative; therefore, these call options behave similarly to European-style options.

The risk-neutral distribution (at a fixed future time T, for a fixed asset) is defined as the price distribution that would hold if market participants were neutral to risk, which they generally are not. However, many asset pricing theories, such as those underlying Black-Scholes option theory and most of the variations found in the Hull book above, allow for the true risk-averse asset price distribution to be obtained from the risk-neutral distribution f(x) just by adjusting the latter by an appropriate risk premium: if there are no dividends, the true distribution is just $f(x e^{-(\mu - r)T})$, where μ - r is the expected annual return rate for the stock in excess of the risk-free rate r. We use a variation on this simple format, slightly modified to allow for dividends (see below), though our invention could also work well with a more complicated adjustment. In this format, a value for μ - r must still be supplied. We use as a default the "consensus estimate" taken from the textbook "Active Portfolio Management" (1995) by Grinold and Kahn. These authors note a long-term average value of the risk premium to be 6% per year, and suggest multiplying this number by the stock's beta to get μ - r. The parameter beta is the slope of the line giving a regression of the stock in question against a market portfolio, often taken as the S&P 500. This is the well-known CAPM estimate for the expected excess return. Whether good or bad, its stature as a consensus estimate makes it suited to our aim of providing a market view, though it is only a default. Our invention, which provides the risk-neutral component of the probabilities, could work with other estimates for the risk-averse adjustment parameter μ - r and with any explicit scheme for adjusting the risk-neutral probability density to the risk-averse probability density. It is worth pointing out that, for shorter time periods (even a month or two), the risk adjustment required is small and generally overwhelmed by fluctuations in the risk-neutral distribution itself.

We give a brief proof that the second derivative procedure gives the correct risk-neutral probability distribution. As in Hull, we may calculate the European call or put price as an expected value in the risk-neutral distribution.

If the actual value of the asset on the expiration date is v, then the value of a call option at strike price x is max{v - x, 0}, and the value of a put option is max{x - v, 0}. If the actual value is a random variable with pdf f(v), then the expected value of a call option at x at the expiration date is

$$c_T(x) = E_v[\max\{v - x, 0\}] = \int_x^\infty (v - x) f(v)\,dv,$$

and the expected value of a put option at x at the expiration date is

$$p_T(x) = E_v[\max\{x - v, 0\}] = \int_0^x (x - v) f(v)\,dv.$$

The current values c(x) and p(x) may be obtained by discounting $c_T(x)$ and $p_T(x)$ by $e^{-rT}$, where r is the risk-free interest rate, but for our purposes, forecasting probability distributions at time T, we do no discounting, and henceforth just write $c(x) = c_T(x)$, $p(x) = p_T(x)$.

Parenthetically, from these expressions we observe that

$$p(x) - c(x) = \int_0^\infty (x - v) f(v)\,dv = x - E_v[v] = x - s^*,$$

where s* = E_v[v] is the expected value of the asset at the expiration date under the risk-neutral distribution. (If there are no dividends, then $s^* = se^{rT}$; if there are dividends, then in general it is necessary to subtract from $se^{rT}$ the value at time T of the dividends.) This well-known relation is called put-call parity; it shows why either price curve carries the same information.

From the above expression for c(x), it follows that its first derivative is

$$c'(x) = -\int_x^\infty f(v)\,dv = F(x) - 1,$$

where $F(x) = \int_0^x f(v)\,dv$ is the cumulative distribution function (cdf) of the random variable v. To prove this, note that $v - x = \int_x^v dz$. Therefore

$$c(x) = \int_x^\infty (v - x) f(v)\,dv = \int_x^\infty dv \int_x^v dz\, f(v) = \int_x^\infty dz \int_z^\infty dv\, f(v) = \int_x^\infty dz\,(1 - F(z)),$$

where we interchange the variables v, z to integrate over the two-dimensional region $\{(v, z) : x \le z \le v\}$. The last expression implies that c'(x) = -(1 - F(x)).

From put-call parity, it follows similarly that p'(x) = 1 + c'(x) = F(x).

Since the cdf and pdf are related by F'(x) = f(x), these expressions in turn imply that the second derivative of either c(x) or p(x) is the pdf f(x): c"(x) = p"(x) = F'(x) = f(x).
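The result just proved can be checked numerically. The sketch below assumes, purely for illustration, an exponential pdf with a hypothetical rate `LAM`, for which the call price curve has the closed form $c(x) = e^{-\lambda x}/\lambda$; the function names are likewise illustrative. It is not a realistic asset price model, only a convenient test case.

```python
import math

LAM = 0.02  # hypothetical rate; the mean price is s* = 1 / LAM = 50

def pdf(v):
    """Assumed pdf f(v) = LAM * exp(-LAM * v) on v >= 0."""
    return LAM * math.exp(-LAM * v)

def call(x):
    """Undiscounted call price E[max(v - x, 0)] for the assumed pdf."""
    return math.exp(-LAM * x) / LAM

def put(x):
    """Undiscounted put price from put-call parity: p(x) = c(x) + x - s*."""
    return call(x) + x - 1.0 / LAM

def second_derivative(g, x, h=1e-2):
    """Central second difference approximating g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
```

Evaluating `second_derivative(call, x)` or `second_derivative(put, x)` at any strike reproduces `pdf(x)` up to discretization error.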

The general character of the option price curves c(x) and p(x) is therefore as follows:

• For all x less than the minimum possible value of v (i.e., such that F(x) = 0), c(x) = E_v[v] - x = s* - x and p(x) = 0. In other words, c(x) is a straight line of slope -1 starting at c(0) = E_v[v] = s*, while p(x) = 0.

• For all x greater than the maximum possible value of v (i.e., such that F(x) = 1), c(x) = 0 and p(x) = x - s*. In other words, p(x) is a straight line of slope +1 and x-intercept s*, while c(x) = 0.

• These two line segments are joined by a continuous convex ∪-shaped curve whose slope increases from -1 to 0 for c(x), and from 0 to +1 for p(x).

We note that the fact that the mean E_v[v] of the pdf f(x) is s*, the value in future dollars at time T of the underlying price s (less the value of any dividends), implies that option prices must be constantly adjusted to reflect changes in the underlying price s, even if there is no market activity in the options.

The fact that s* = E_v[v] also implies that an option price curve can make no prediction about the general direction of the underlying price s. However, the option price curve does predict the shape of the pdf f(x), and in particular its volatility.

1.1 Estimates based on a finite subset of bid-asked option prices

In practice, option prices c(x) and p(x) are quoted only for a finite subset of equally-spaced strike prices x, namely x = nΔ for integer n and spacing Δ. We denote c(nΔ) and p(nΔ) by c_n and p_n, respectively. Moreover, quotes specify only a bid-asked spread, not exact prices. In this subsection we give methods for dealing with these problems. (Most of the Jackwerth-Rubinstein paper (op. cit.) is concerned with these kinds of curve-fitting problems.)

The first derivatives c'(x) and p'(x) at $x = (n + \tfrac{1}{2})\Delta$ may be estimated by the first differences

$$c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}, \qquad p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}.$$

The corresponding estimates of the cdf $F_{n+1/2} = F((n + \tfrac{1}{2})\Delta)$ are

$$F_{n+1/2} = 1 + c'_{n+1/2}; \qquad F_{n+1/2} = p'_{n+1/2}.$$

Thus, using both bid and ask prices for both European-style puts and calls, one can compute four different estimates for the cdf $F_{n+1/2}$, which can then be combined into a single estimate.

This combination will preferably take into account whether $x = (n + \tfrac{1}{2})\Delta$ is much less than the underlying price s ("deep out-of-the-money"), near s ("near the money"), or much greater than s ("deep in-the-money"), according to the different patterns of setting bid-asked spreads in these different ranges. Another consideration is avoiding quotes near prices where early exercise is likely, such as deep in-the-money puts.
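A minimal sketch of the first-difference cdf estimates follows. It assumes that every price curve is sampled at the same equally spaced strikes, and it combines the estimates with a plain equal-weight average; that averaging rule and the function name `cdf_estimates` are illustrative assumptions, whereas a combination as described above would weight the estimates by moneyness and reliability.

```python
def cdf_estimates(call_curves, put_curves, delta):
    """Estimate the cdf F at the midpoints (n + 1/2) * delta.

    call_curves / put_curves: lists of price lists (for example bid and
    ask), each sampled at the same equally spaced strikes.
    Returns one averaged estimate per midpoint.
    """
    per_curve = []
    for prices in call_curves:
        # F = 1 + c'(x), with c' estimated by a first difference
        per_curve.append([1.0 + (hi - lo) / delta
                          for lo, hi in zip(prices, prices[1:])])
    for prices in put_curves:
        # F = p'(x), with p' estimated by a first difference
        per_curve.append([(hi - lo) / delta
                          for lo, hi in zip(prices, prices[1:])])
    # Equal-weight average across all available estimates (assumption).
    return [sum(vals) / len(vals) for vals in zip(*per_curve)]
```

For example, with hypothetical call quotes `[20.0, 15.2, 10.8, 7.0, 4.1, 2.2, 1.1]` and put quotes `[0.0, 0.2, 0.8, 2.0, 4.1, 7.2, 11.1]` at strikes 30 through 60 (Δ = 5), both curves give the same midpoint estimates, because these made-up quotes satisfy put-call parity with s* = 50.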

Similarly, the second derivatives c"(x) and p"(x) at x = nΔ may be estimated by the first differences of the estimates of the first derivatives; e.g.,

$$c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}.$$

We may take $c''_n$ or $p''_n$ or some combination as above as our estimate $f_n$ of the pdf f(nΔ).

Note that since f(x) ≥ 0, option prices should satisfy a convexity condition, e.g., $c_{n+1} - 2c_n + c_{n-1} \ge 0$ for call option prices. Indeed, violation of this condition would allow making money via a risk-free "butterfly straddle" involving buying one call option at (n + 1)Δ and another at (n - 1)Δ, and selling two call options at nΔ. A similar result holds for put options.
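The second-difference pdf estimate and the convexity check can be sketched as below. The function names and the optional tolerance `tol` are illustrative assumptions.

```python
def pdf_estimates(prices, delta):
    """Second differences of option prices: estimates f_n at interior strikes."""
    return [(hi - 2.0 * mid + lo) / (delta * delta)
            for lo, mid, hi in zip(prices, prices[1:], prices[2:])]

def convexity_violations(prices, tol=0.0):
    """Indices n where c[n+1] - 2*c[n] + c[n-1] < -tol.

    Each such index marks a strike at which the quoted prices would admit
    the risk-free butterfly position described in the text.
    """
    return [n for n in range(1, len(prices) - 1)
            if prices[n + 1] - 2.0 * prices[n] + prices[n - 1] < -tol]
```

A nonempty result from `convexity_violations` would suggest stale or inconsistent quotes, which could then be excluded or down-weighted before estimating the pdf.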

1.2 Dynamic estimates

The methods considered in the previous subsection allow estimation of the cdf and pdf at a subset of Δ-spaced values of x, based on a static set of option quotes at a particular time.

As previously noted, however, option prices must change continually in response to changes in the underlying price s. Let s* denote the corresponding forward price at expiration (the price s evaluated with interest). Suppose this price (measured in dollars at expiration) moves up (or down) by a small amount, an increment ε in its logarithm, say, with little or no change in volatility. Here ε may be viewed as, approximately, the percentage move δ/s* caused by a move of δ in the (forward) stock price. We expect in this situation that the (forward) probability distribution for the stock price will just be shifted by ε in the log domain. That is, the distribution will appear to be identical there, except with a mean shifted by ε. Thus, the value of the new cdf at $x = e^{\ln x}$ is $F(e^{\ln x - \varepsilon}) = F(x/a)$, where F denotes the original cdf with distribution mean s*, and $a = e^{\varepsilon}$. A reasonable call price functional equation that gives the same effect, upon differentiation, is $a\,c(s^*, x/a) = c(as^*, x)$, where $c(s^*, x)$ denotes the price, in dollars at expiration, for a call option at strike x when the underlying is at (forward) price s*. Note in this equation that all other variables, such as volatility, are assumed to be the same, which will only be approximately true, even for very small values of ε.

But, assuming this approximation, we can think of an option price at strike x, measured when the (forward) price has moved to as*, for a near 1, as giving instead a times the price of an option at strike x/a, but corresponding to the current underlying price s*. Considering all the strikes at which options are frequently quoted, and thinking additively, we can effectively observe c(x) (and p(x)) for a different subset of approximately equally-spaced strike prices, roughly x = nΔ - δ for various values of δ = εs*. Some care must, of course, be taken to ensure simultaneity of prices, of option and underlying. For this reason, we may prefer to consider the values of nΔ (corresponding to the various standard strike values) separately, and synchronize observed time of sales for an option at a given strike with the underlying security. Implied volatilities (discussed below) could be monitored, to ensure their changes relative to ε were small.
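The re-mapping of a quote to an effective strike can be sketched as follows. The helper `rescale_quote` is a hypothetical name, and `model_call` is an assumed scale-family price curve (an exponential pdf with mean equal to the forward price) used only to confirm that the functional equation holds exactly when nothing but the scale of the distribution changes.

```python
import math

def rescale_quote(strike, price, a):
    """Map a call quote observed at forward price a * s_ref back to s_ref.

    Uses a * c(s_ref, x / a) = c(a * s_ref, x): a quote at strike x becomes
    an effective quote at strike x / a, with the price divided by a.
    """
    return strike / a, price / a

def model_call(s_fwd, x):
    """Assumed test model: undiscounted call price for an exponential pdf
    with mean s_fwd (hypothetical, for checking the scaling relation)."""
    return s_fwd * math.exp(-x / s_fwd)
```

With a = 1.02, a quote at the standard strike 50 yields an effective observation at strike 50/1.02, roughly 49.02, so successive moves of the underlying fill in strikes between the quoted ones.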

Using a similar technique to that described in the above paragraphs, meaningful average option prices for a given strike can also be computed, using thin strike intervals and using either short time intervals or time series methods (time averages weighting the present more than the past). Note that, without the framework described in this subsection, the computation of "average" option prices at a given strike is problematic when the stock price varies in the period over which the average is taken.

To summarize: Given enough movements of the underlying price, we can effectively observe prices and compute estimates as above for a much more finely quantized subset of strike prices x, and provide a framework for improving accuracy through averaging methods.

2 Methods for extrapolation and smoothing

There are two major limitations to the basic methods of the previous section. One is that option quotes are available only for certain expiration dates. Another, not so obvious, is that option quotes are reliable primarily for options in which there is substantial market activity. These would typically be nearer-term options at strike prices near the money (the underlying price).

To extend our prediction methods to times other than expiration dates and over wider ranges of strike prices (and also to help reduce "noise" in our displays), we use extrapolation and smoothing techniques. We have found that it is advantageous to do extrapolation and smoothing in the volatility domain.

There are many reasons for this advantage. For example, option practitioners are well aware of the kinds of shapes that the volatility curves (sometimes called "volatility smiles") have had historically, in various markets, and how these curves vary with time; this can be a guide to imposing structure on the smoothing curves to prevent overfitting of possible artifacts. Many records have been kept of the volatilities implied by option prices, and it is easy to examine how in the past they have changed with respect to price behavior. For example, the Chicago Board Options Exchange makes public its average near-the-money volatility index (now called VIX) for S&P 100 options back to 1986. Finally, it is easier to work visually with volatility curves, which would theoretically be flat if f(x) were lognormal, than with visual differences in near-lognormal pdfs, which can all look very much alike. Mathematically, model improvements can be made in the volatility domain just by changing coefficients of low-degree polynomial approximations, even though these affect higher-order terms in power series for the corresponding cdfs or pdfs.

The following subsections explain more precisely how to work in the volatility domain.

2.1 Lognormal pdfs

The standard Black-Scholes theory of option pricing (see Hull, op. cit.) yields a lognormal pdf f(v) whose expected value is E_v[v] = s*, such that ln v is a Gaussian (normal) random variable with variance σ²T, where the parameter σ is called the volatility rate of the asset, and T is the time to expiration. By a standard property of lognormal distributions, this implies that the mean of ln v is E_v[ln v] = ln s* - σ²T/2.

From this pdf follows the famous Black-Scholes call option pricing formula [Hull, Appx. 11A]:

$$c(x) = E_v[\max\{v - x, 0\}] = s^* N(d_1(x)) - x N(d_2(x)),$$

where $N(d_1(x))$ and $N(d_2(x))$ are values of the cumulative distribution function of a Gaussian random variable of mean zero and variance 1 at the points

$$d_1(x) = \frac{\ln(s^*/x) + \sigma^2 T/2}{\sigma\sqrt{T}}, \qquad d_2(x) = \frac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}} = d_1(x) - \sigma\sqrt{T}.$$

(Recall that our version of the call price is not discounted, and is given in dollars at time T, and that s* is today's stock price, valued in dollars at time T, less the value of any dividends.) Note that $\sigma\sqrt{T}$ is the standard deviation of ln v; therefore $-d_2(x)$ is just ln x, measured in standard deviations from the mean E_v[ln v].

Similarly, by put-call parity, we have the Black-Scholes put option pricing formula

$$p(x) = c(x) + x - s^* = s^*(N(d_1(x)) - 1) - x(N(d_2(x)) - 1) = xN(-d_2(x)) - s^*N(-d_1(x)).$$

Taking the derivative with respect to x, and using $s^*N'(d_1(x)) = xN'(d_2(x))$ and $d_1'(x) = d_2'(x)$ (the latter equation holding under the assumption of constant volatility, which we will later drop), we obtain

$$F(x) = c'(x) + 1 = -N(d_2(x)) + 1 = N(-d_2(x)).$$
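The undiscounted pricing formulas and the resulting cdf can be sketched with the Python standard library. The function names are illustrative, and the sketch assumes that σ is annualized and T is expressed in years; that unit convention is an assumption of the example.

```python
from math import log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution (mean 0, variance 1)

def d1_d2(x, s_fwd, sigma, T):
    """Return (d1, d2) for strike x and forward price s_fwd (no discounting)."""
    w = sigma * sqrt(T)  # standard deviation of ln v
    d1 = (log(s_fwd / x) + 0.5 * w * w) / w
    return d1, d1 - w

def call(x, s_fwd, sigma, T):
    """Undiscounted call price c(x) = s* N(d1) - x N(d2)."""
    d1, d2 = d1_d2(x, s_fwd, sigma, T)
    return s_fwd * _N.cdf(d1) - x * _N.cdf(d2)

def put(x, s_fwd, sigma, T):
    """Undiscounted put price p(x) = x N(-d2) - s* N(-d1)."""
    d1, d2 = d1_d2(x, s_fwd, sigma, T)
    return x * _N.cdf(-d2) - s_fwd * _N.cdf(-d1)

def cdf(x, s_fwd, sigma, T):
    """Lognormal cdf F(x) = N(-d2(x))."""
    return _N.cdf(-d1_d2(x, s_fwd, sigma, T)[1])
```

A central difference of `call` in the strike, plus 1, agrees with `cdf` to within discretization error, and `put` minus `call` equals x - s*, which mirrors the derivation above.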

Now F(x) is the probability that v ≤ x, which is equal to the probability that ln v ≤ ln x, which, since ln v is Gaussian with mean E_v[ln v] and standard deviation $\sigma\sqrt{T}$, is given by

$$F(x) = \Pr\{v \le x\} = \Pr\{\ln v \le \ln x\} = N\!\left(\frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}\right) = N(-d_2(x)).$$

Thus we have verified that the Black-Scholes pricing formulas give the correct cdf F(x). The derivative of F(x) will thus yield the correct lognormal distribution f(x) = F'(x).

2.2 Warping of general cdfs

Now let F(x) be an arbitrary cdf on $\mathbb{R}_+$; i.e., a function that monotonically increases from 0 to 1 as x goes from 0 to infinity. For simplicity we will assume that F(x) is strictly monotonically increasing; i.e., f(x) = F'(x) > 0 everywhere. Then there exists a continuous one-to-one "warping function" $y : \mathbb{R}_+ \to \mathbb{R}$ such that F(x) = N(y(x)) everywhere; i.e., such that the probability that a random variable v with cdf F(x) will satisfy v ≤ x is equal to the probability that a standard Gaussian random variable n with mean zero and variance 1 will satisfy n ≤ y(x). Similarly, there is an inverse warping function x(y) such that F(x(y)) = N(y).

Given the warping function y(x), the cdf F(x) may be retrieved from the relation F(x) = N(y(x)). Therefore the cdf F(x) completely specifies the warping function y(x), and vice versa; i.e., both curves carry the same information.

If F(x) is the cdf of a lognormal variable v such that ln v has mean E_v[ln v] = ln s* - σ²T/2 and variance σ²T, as in the previous subsection, then the warping function is given by

$$y(x) = -d_2(x) = \frac{\ln x - (\ln s^* - \sigma^2 T/2)}{\sigma\sqrt{T}} = \frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}.$$

For this reason we may sometimes write y(x) as -d₂(x), even when the cdf is not lognormal so that the right-hand equation above for d₂(x) does not hold.

2.3 Implied volatilities

If f(x) is not lognormal, then the Black-Scholes pricing formulas do not hold. Nonetheless, given an option price c(x) or p(x), it is common practice to define the implied volatility σ(x) as the value of σ such that the Black-Scholes pricing formula holds, for a given x, s and T.

The implied volatility curve σ(x) so defined is a function of the strike price x, which is constant if and only if the pdf f(x) is actually lognormal. In practice, it is typically a convex ∪-shaped curve, called a "volatility smile." See, e.g., Hull, Chapter 17.
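Because the Black-Scholes call price increases with σ, the traditional implied volatility can be found by a simple bracketing search. The following is a minimal sketch using bisection; the bracket `[lo, hi]`, the tolerance, and the function names are illustrative assumptions, and prices are undiscounted with T in years.

```python
from math import log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution

def bs_call(x, s_fwd, sigma, T):
    """Undiscounted Black-Scholes call price at strike x."""
    w = sigma * sqrt(T)
    d1 = (log(s_fwd / x) + 0.5 * w * w) / w
    return s_fwd * _N.cdf(d1) - x * _N.cdf(d1 - w)

def implied_vol(price, x, s_fwd, T, lo=1e-4, hi=5.0, tol=1e-10):
    """Bisection for sigma such that bs_call(x, s_fwd, sigma, T) = price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(x, s_fwd, mid, T) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Applying `implied_vol` strike by strike to a set of quotes gives the pointwise curve σ(x).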

From Subsection 2.1, we can see that there is a second method of calculating implied volatilities, as follows. Suppose that we have an estimate of the cdf F(x). Define the cdf-implied volatility σ₁(x) as the value of σ such that the Black-Scholes cdf formula F(x) = N(-d₂(x, σ, T)) holds, for a given x, s and T.

The first method has the advantages of being defined directly from raw price data, and of being well understood in the financial community. However, the second method has the following advantages:

1. It is easier to calculate, at least from estimates of F(x);

2. It gives a simpler and arguably more intuitive relationship between volatility and the cdf F(x). If we use the traditional implied volatility σ(x), then the relationship is instead

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\,\sigma'(x).$$

3. It fits better with the multivariate theory to be developed below.

We have observed that the two curves σ(x) and σ₁(x) seem to be fairly similar, at least as to the direction of their slope, and are generally not too far apart in value "near the money". Also σ₁(x) = σ(x) whenever σ(x) has zero slope, though σ₁(x) is a little smaller than σ(x) when the slope σ'(x) is negative (which often occurs for stocks). See the above equation. Finally, one function is as ad hoc as the other. Therefore, because of the above reasons, we generally prefer to use the cdf-implied volatility curve σ₁(x).

In any case, it is clear that either σ(x) or σ₁(x) contains the same information as any of the curves c(x), p(x), F(x) or f(x). From σ(x) or σ₁(x) we can recover c(x) or F(x) using the Black-Scholes call option pricing or cdf formula, and from this we can obtain all other curves.

2.4 Extrapolation and smoothing in the volatility domain

The volatility curve σ(x) or σ₁(x) may be calculated pointwise from the corresponding curve c(x) or p(x) to give a set of values at a finite subset of strike prices x. Each of these values may be deemed to have a certain degree of reliability.

It is then a standard problem to fit a smoothed and extrapolated curve σ(x) or σ₁(x) to these points, taking into account their relative reliabilities. Any standard smoothing and extrapolation method may be used. In general, the usual problems of avoiding overfitting or oversmoothing must be addressed.

It is well known that implied volatilities also vary with time. We generally wish to estimate curves σ(x, T) or σ₁(x, T) as replacements for the constant volatility σ in the Black-Scholes formulas, e.g., c(x) = c(x, σ, T) or F(x) = N(-d₂(x, σ, T)).

In an especially meaningful example, we have experimented with a class of smoothing algorithms used in "Implied volatility functions: Empirical tests," by B. Dumas, J. Fleming and R. E. Whaley, J. Finance, vol. 53, pp. 2059-2106, Dec. 1998. These authors fit an implied volatility curve σ(x), for the purpose of setting up a "strawman" option price model for testing (and defeating) a theory regarding the role of volatility in option pricing. Their "strawman" option pricing model c(x) was obtained by putting the resulting smoothed curve back into the Black-Scholes call formula. It is a "strawman" ad hoc model, because no intuitive notion of stock volatility could possibly vary with strike price, which the stock never "sees." Nevertheless, their model performed admirably, surpassing in predictive power the highly regarded "implied tree" method. One possible explanation offered was that their model mimicked in a smooth way interpolation methods actually employed by practitioners in the options markets. (See the discussion of "Volatility matrices" in Hull, cited above.) Such an approach to option pricing seems ideal to us, because of its accuracy and because its underlying rationale represents a market view. Thus, we use the Dumas-Fleming-Whaley model for our own entirely different purpose, that of forecasting probability distributions. All that is necessary is to differentiate their call price model, which, conveniently for us, is a smooth function of strike price and other standard variables such as time, current stock price, and the risk-free rate of interest. The formula for the cdf F(x) is, as before, this derivative with 1 added, or

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\,\sigma'(x).$$

We can make this very explicit. We have

$$\frac{\partial c}{\partial \sigma} = x\sqrt{T}\,N'(-d_2(x)),$$

where N'(z) denotes the standard normal density, while σ'(x) may be computed by differentiating the Dumas-Fleming-Whaley fitted volatility curve. The latter has the form

$$\sigma(x, T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 xT.$$

The coefficients {a_i} are determined by regression. This kind of quadratic curve-fitting is easily implemented. Dumas-Fleming-Whaley impose a constraint to prevent their volatilities from going below 0 (or even below 0.01), and we have imposed further constraints on extrapolations (which we often carry out beyond the range of their tests), to ensure that the final cdf does not go below zero or above one. We have experimented with other variations on their basic approach, for example, using linear interpolation in the time domain, where we do not need to take derivatives. Our methods would, of course, work with any approach, possibly quite different, to volatility curve-fitting, though the general Dumas-Fleming-Whaley approach has many things going for it: accuracy, conformity to marketplace use of Black-Scholes, smoothness (differentiability, in particular), conformity to historical experience regarding the smile structure of volatility curves (especially important for extrapolation), and simplicity (which, beyond ease of implementation, helps avoid overfitting). These advantages are achieved in a probability context that was not considered in the paper where these volatility curves were introduced.
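Given fitted coefficients, the cdf with its volatility-slope correction can be evaluated as in the sketch below. The function names, the simple clamping of the result to [0, 1], and the treatment of the slope as zero where the 0.01 floor is active are assumptions of this example; the regression step that produces the coefficients is not shown, and T is taken in years.

```python
from math import log, sqrt
from statistics import NormalDist

_N = NormalDist()   # standard normal distribution
SIGMA_FLOOR = 0.01  # lower bound on fitted volatility, as mentioned in the text

def _raw_sigma(x, T, a):
    return a[0] + a[1] * x + a[2] * x * x + a[3] * T + a[4] * T * T + a[5] * x * T

def fitted_sigma(x, T, a):
    """Quadratic volatility surface sigma(x, T), floored at SIGMA_FLOOR."""
    return max(_raw_sigma(x, T, a), SIGMA_FLOOR)

def fitted_sigma_slope(x, T, a):
    """d sigma / d x of the fitted surface (zero where the floor is active)."""
    if _raw_sigma(x, T, a) <= SIGMA_FLOOR:
        return 0.0
    return a[1] + 2.0 * a[2] * x + a[5] * T

def smile_call(x, s_fwd, T, a):
    """Undiscounted Black-Scholes call using the fitted volatility."""
    w = fitted_sigma(x, T, a) * sqrt(T)
    d1 = (log(s_fwd / x) + 0.5 * w * w) / w
    return s_fwd * _N.cdf(d1) - x * _N.cdf(d1 - w)

def smile_cdf(x, s_fwd, T, a):
    """F(x) = N(-d2) + (dc/dsigma) * sigma'(x), clamped to [0, 1]."""
    w = fitted_sigma(x, T, a) * sqrt(T)
    d2 = (log(s_fwd / x) - 0.5 * w * w) / w
    vega = x * sqrt(T) * _N.pdf(-d2)  # dc/dsigma, undiscounted
    value = _N.cdf(-d2) + vega * fitted_sigma_slope(x, T, a)
    return min(1.0, max(0.0, value))
```

With a flat surface (all coefficients except `a[0]` equal to zero) the correction term vanishes and `smile_cdf` reduces to N(-d₂); with a sloped surface it matches 1 plus the numerical strike derivative of `smile_call`.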

3 The multivariate case

The methods in the previous sections are capable of generating a display of raw or smoothed and extrapolated probability distributions for any optionable asset. Option prices are quoted on a large number of securities, as well as on certain indices, such as the S&P 500.

However, an investor would also like to know future probability distributions for:

• His or her entire portfolio;

• Mutual funds;

• A security without a quoted option;

• A security in a hypothetical scenario.

All of these questions involve considerations of several securities at once, and the probabilities of their simultaneous configuration of prices. This is clearly a consideration in the first two items above, but also enters in the third, where we would want to extract as much information as possible about the security without a quoted option price from those correlated with it that do have quoted options. Finally, in scenario analysis there are many questions that involve considering the probabilities of several security prices occurring at once, including changes in factors influencing the market that might be modeled by changes in a portfolio of those securities most affected. We will take all of these issues up in the remainder of this document, but for now we just try to give a basic introduction.

For a portfolio of securities, or a mutual fund, we are interested in a composite asset of the form $x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n$, where the $x_i$ are all assets for which we individually know the cdf $F(x_i)$ or the pdf $f(x_i)$. To give our method the most flexibility, we do not require that this knowledge come from any particular procedure, though we favor the approach of the preceding two sections. However, even for some securities or indices with a quoted option, we might not feel there was sufficient option activity to justify a full fit of a volatility curve, and might take a cruder substitute, even a flat straight line based on an average of available implied volatilities. In addition, it is convenient to allow the possibility that a few assets we are monitoring might not have any quoted option at all; this is easily accommodated by, say, using a flat volatility curve with a historical value for volatility. For testing purposes and comparisons we might even want to consider a list of assets with all volatility curves given this way. In any case our methodology here is very general, and we only require that we know warping functions $y_i(x_i)$ such that $F(x_i) = N(y_i(x_i))$ for all i. If the asset has an active options market, then the warping functions may be determined by either first estimating $F(x_i)$ directly from (finite differences of) options price data, as in subsection 2.2, or by using the approach discussed later in Section 2 of extrapolating and smoothing in the volatility domain. In the latter case we have an explicit form of the warping function $y_i(x_i)$ in terms of a fitted volatility curve $\sigma_1(x_i, T)$ as $y_i(x_i) = -d_2(x_i, \sigma_1(x_i, T), T)$, and this equation can also be used with any volatility curve with the assets above that might have fewer or no traded options. In a later section we will discuss portfolios in the logarithm domain, possibly containing long and short positions. One can think of warping them to standard normal directly, subtracting the mean and dividing by the standard deviation. Alternately, to keep our notation uniform, one can invent an asset with price $x_i$ such that $-d_2(x_i)$ gives this warped value (using for $\sigma_1(x_i)$ the observed historical volatility). But we wish to emphasize that the method we are describing works with ANY single-variable warping functions, even using a different one for each variable. The only further substantive ingredient is the plausibility of using JOINTLY normal distributions, which we now discuss.

The general problem is to find a multivariate probability distribution for the complete set of variables $(x_1, \ldots, x_n)$, or equivalently for their logarithms. In simple financial models generalizing the Black-Scholes framework, the multivariate distribution of the logvariables is multivariate (i.e., jointly) normal; see Musiela and Rutkowski's book "Martingale methods in financial markets" (1999). This implies that all portfolios of these logvariables are jointly normal, and can also be used with other logvariables and portfolios of them to form a jointly normal distribution. Thus, if we wish, it is reasonable to use BARRA (or functionally equivalent) factors as single (log)variables in our model, using, say, individual normal distributions for them based on historical volatility. These factors may represent fundamentals of companies or even macroeconomic variables such as interest rates. We do not further discuss such factors, but refer to the book of Grinold and Kahn cited above, which also describes how to closely approximate them as portfolios of security returns. Our preference is to not use BARRA factors directly, but stay as much as possible in the world of optionable securities, and address questions involving BARRA factors in terms of approximating portfolios consisting mostly of optionable securities. (But for testing and comparisons, it is still useful to be able to include them directly, and we do have that capability.)

Now we certainly do not wish to use only the simple multidimensional Black-Scholes model, which would not directly allow the nonlognormal input from our single-variable distributions based on the options markets. At the same time, option prices on individual assets do not tell us anything about how assets interact, in particular, their correlations. Fortunately, correlations may be estimated from past (historical) data, and may be viewed as covariances for data that has been standardized (has standard deviation 1). Each multivariate normal distribution is determined by its mean and covariance matrix. Thus, a natural approach is to use the individual distributions to transform or "warp" the variables to standard normal, then impose a multivariate normal structure based on the correlation matrix. This procedure is independent of the individual warping functions, which may be different for different individual variables, and in particular, can incorporate our market-based option distributions for individual variables representing securities with active options markets. A slightly different approach is to use correlations of the warped variables. This procedure is likely to be more accurate, but may involve more computational time.

We indicate some details. As before, it is notationally convenient to use v_i as a second notation for x_i, favoring the latter for fixed values and the former as a variable. Let C be the historical correlation matrix of the log variables (ln v_1, ..., ln v_n), whose entries are the cross-correlations

$$\rho_{ij} = \frac{E(\ln v_i \ln v_j) - E(\ln v_i)\,E(\ln v_j)}{\sqrt{\operatorname{Var}(\ln v_i)\,\operatorname{Var}(\ln v_j)}}.$$

Then all diagonal terms ρ_ii are equal to 1, and C is a positive semi-definite covariance matrix, which we may here assume to be nonsingular (positive definite). If we use instead correlations of warped variables, we have simply

$$\rho_{ij} = E(y_i y_j).$$

Let us define F_C(y_1, ..., y_n) as the cdf of a multivariate Gaussian random variable with mean zero and covariance matrix C. Thus, F_C(b_1, ..., b_n) is the probability that each variable y_i is at most some value b_i. There are more elaborate versions, such as

$$F_C(a_1, \ldots, a_n;\, b_1, \ldots, b_n),$$

giving the probability that each y_i satisfies a_i < y_i < b_i. In the single-variable case these latter functions are obtained from the simple cdf by a single subtraction, involving two terms, but the corresponding bivariate case involves four terms, and in n dimensions there would be 2ⁿ terms. However, each of these more elaborate cdfs can be directly computed as an integral, just like the simple cdf. Since the more elaborate cdfs are needed for Monte Carlo calculations, possibly in high dimensions, it is best to think of them as being computed directly.
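For concreteness, the two-term and four-term cases mentioned here are the usual inclusion-exclusion identities. They are written below in the notation of this section as a standard fact about cdfs, not as a formula quoted from elsewhere in this document:

$$F_C(a_1;\, b_1) = F_C(b_1) - F_C(a_1),$$

$$F_C(a_1, a_2;\, b_1, b_2) = F_C(b_1, b_2) - F_C(a_1, b_2) - F_C(b_1, a_2) + F_C(a_1, a_2).$$

In n dimensions the same pattern runs over all 2ⁿ choices of a_i or b_i in each coordinate, with sign (−1)^k when k of the coordinates take the lower limit a_i.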

We then define the multivariate cdfs

$$F(x_1, \ldots, x_n) = F_C(y_1(x_1), \ldots, y_n(x_n)),$$

and

$$F(a_1, \ldots, a_n;\, b_1, \ldots, b_n) = F_C(y_1(a_1), \ldots, y_n(a_n);\, y_1(b_1), \ldots, y_n(b_n)),$$

where the y_i(x_i) are the known warping functions for the individual variables. We find it convenient, with some abuse of language, to speak of F(x_1, ..., x_n) as "the cdf", even though we have all of the above functions in mind, and to use F(x_1, ..., x_n) as a proxy for the whole distribution (which it does, theoretically, determine). This multivariate cdf then has the following properties:

• Since the marginals of F_C(z_1, ..., z_n) are Gaussian with mean 0 and variance 1, the marginals of F(x_1, ..., x_n) are equal to N(y_i(x_i)) = F(x_i); i.e., they are correct according to each single-variable model.

• If the logvariables (ln v_1, ..., ln v_n) are actually jointly Gaussian, then the multivariate cdf F(x_1, ..., x_n) is correct.

In summary, the true joint distribution is approximated by a jointly lognormal distribution using historical correlations, combined with warping functions on each variable such that the marginal distribution of each variable is correct according to a selected single-variable model (for example, according to our single-variable model for optionable securities, or according to the lognormal model using historical volatility). The single variables may actually be portfolios, with a default distribution for the portfolio return being lognormal, based on historical volatility. This multivariate theory generalizes both our single-variable theory and standard multivariate (log)Gaussian models. It again allows for market input through option prices, to the extent that components have an active option market, but does not exclude nonoptionable securities, and also allows portfolios as single variables. In this way BARRA (or functionally equivalent) factors are also allowed because of their interpretation as portfolios of long and short positions.

4 Applications to portfolios

Given the multivariate cdf F(x_1, ..., x_n) = F_C(y_1(x_1), ..., y_n(x_n)), we can answer many typical questions. We first give an overview, and then take up some of the applications in more detail.

As one example, suppose that we want to find the cdf of a portfolio variable x = h_1x_1 + h_2x_2 + ... + h_nx_n, where the h_i are arbitrary coefficients. A simple Monte Carlo method, probably not the fastest, is to draw random samples from the jointly Gaussian distribution with cdf F_C(y_1, ..., y_n), transform each y_i via the inverse mapping function x_i(y_i), and then compute the resulting output sample x = h_1x_1(y_1) + ... + h_nx_n(y_n).

After enough samples, we will have an approximation to the cdf of x. More precisely, the probability that a < x < b is, approximately, the fraction of samples y_1, ..., y_n with a < h_1x_1(y_1) + ... + h_nx_n(y_n) < b, and this approximation becomes exact in the limit for large sample sizes. This works for real portfolios, or for portfolios constructed from a number of assets and a residual variable, as might arise from a regression. Usually the regression is done in the log domain, which we discuss below. Note that the Monte Carlo method just described works perfectly well if the expression for x above is replaced by any function f(x_1, ..., x_n) of the x_i, possibly quite nonlinear.
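The Monte Carlo procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the inverse warping x_i(y_i) used here is the closed-form lognormal (constant-volatility) case, x = s* exp(σ√T y − σ²T/2), which is assumed rather than taken from this section. An option-implied warping function would be substituted for it in practice.

```python
import math
import random


def cholesky(c):
    """Lower-triangular L with L L^T = c, for a small positive definite matrix."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def sample_warped(low, rng):
    """Draw one vector y ~ N(0, C) from the Cholesky factor of C."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]


def lognormal_inverse_warp(s_star, sigma, t):
    """ASSUMED inverse warping x_i(y_i): lognormal marginal with constant volatility."""
    def x_of_y(y):
        return s_star * math.exp(sigma * math.sqrt(t) * y - 0.5 * sigma ** 2 * t)
    return x_of_y


def portfolio_probability(a, b, holdings, inverse_warps, corr,
                          n_samples=50_000, seed=0):
    """Estimate Pr(a < h_1 x_1 + ... + h_n x_n < b) as a fraction of samples."""
    rng = random.Random(seed)
    low = cholesky(corr)
    hits = 0
    for _ in range(n_samples):
        y = sample_warped(low, rng)
        x = sum(h * warp(yi) for h, warp, yi in zip(holdings, inverse_warps, y))
        if a < x < b:
            hits += 1
    return hits / n_samples


# Example: two assets with correlation 0.5 between their warped variables.
warps = [lognormal_inverse_warp(100.0, 0.2, 1.0),
         lognormal_inverse_warp(50.0, 0.3, 1.0)]
corr = [[1.0, 0.5], [0.5, 1.0]]
p = portfolio_probability(140.0, 160.0, [1.0, 1.0], warps, corr)
```

Replacing the weighted sum inside the loop by any other function of the sampled x_i gives the nonlinear case noted above.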

4.1 Log domain portfolios

In this subsection, we point out how our methods fit with another paradigm in common use in the financial community, and set up some further notation. It is common to work in the return domain, or equivalently, with logarithms; i.e., ln x = β_1 ln x_1 + ... + β_n ln x_n.

Ignoring any possible identification of these variables with those in the previous section, the same discussion and Monte Carlo method as above applies, if we regard x as a nonlinear portfolio x = f(x_1, ..., x_n) = exp(β_1 ln x_1 + ... + β_n ln x_n). If the sum B of the β_i's is 1, such an x may be written x = h_1x_1 + h_2x_2 + ... + h_nx_n where h_i = β_i x/x_i. Even if B is not 1, incremental changes ("returns") d ln x computed from this equation for x are consistent with the above expression for ln x. It is common in the financial community to think of h_i as approximately a constant, so that for short periods, where the x_i's do not change too much, this equation for x is comparable to the portfolio equation in the previous subsection.²

For an asset x not given explicitly in terms of the x_i, we obtain a similar expression via linear regression:

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n.$$

The β_i for i ≠ 0 are correlation coefficients chosen to minimize the variance of the residual in historical data (perhaps subject to constraints, such as β_i > 0 and Σ_i β_i = 1). For example, x might be a security without a quoted option, and the x_i for i ≠ 0 could be taken as assets for which we individually know the probability distributions, in addition to the required correlation coefficients for x. We have written the residual term as β_0 ln x_0 (usually thinking of β_0 = 1 and the residual as normally distributed).³ The mean of the latter could be nonzero, giving the regression "alpha", a constant term making the mean of the regression correct. Alternatively, we could modify the equation to allow an explicit alpha, and keep the residual mean zero. Another minor variation might include the addition of a dummy variable with constant return, to adjust the value of x up or down. In particular, this gives another way of adjusting the residual mean to zero. This equation gives the previous one as a special case if we allow β_0 = 0.

4.1.1 Fast fits of portfolios

One approach, which promises to be relatively fast computationally, is the following. As in the development of cdf-implied volatilities in Section 2, let us assume that each logvariable ln x_i above is "Gaussian" with nonconstant variance σ₁(x_i)²T. In other words, the cdf is given by F(x_i) = N(−d₂(x_i, σ₁(x_i), T)). Our aim will be to give F(x) by a similar equation, using some kind of fitted curve σ₁(x). We will assume that we have some class of volatility curves in mind, with a small number of parameters which must be determined.

If the variables ln x_0, ..., ln x_n were truly jointly Gaussian, then ln x would also be Gaussian. Its variance would be given by the formula

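$$\operatorname{Var}(\ln x) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T, \qquad \sigma_i^2 T = \operatorname{Var}(\ln x_i).$$

We therefore define

$$\sigma_1(x)^2 = E\Big(\sum_{i,j} \beta_i \sigma_1(x_i)\, \rho_{ij}\, \beta_j \sigma_1(x_j) \;\Big|\; \ln x = \sum_i \beta_i \ln x_i\Big).$$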

The calculation of the above conditional expectation may be done with Monte Carlo methods. In the language of nonlinear portfolios above, we would take the function f(x_1, ..., x_n) to be 0 outside a thin multidimensional solid enclosing the hyperplane defined by ln x = Σ_i β_i ln x_i. Inside the solid we would take f(x_1, ..., x_n) equal to the above expression for Var(ln x), divided by the probability of being in the solid (also a Monte Carlo calculation). In terms of samples, we just take the average of Var(ln x) over all the samples that end up inside the thin solid. However, it is not necessary to compute all values of σ₁(x), but only enough to fit the parameters for the volatility curves we are using.

² Thus h_i dx_i/x = β_i dx_i/x_i = β_i d ln x_i, so that for small changes dx_i, the change dx from the first equation is approximately the same as would be obtained from the second. However, this relationship requires "rebalancing" to remain a good approximation for longer periods.

³ For the residual term i = 0, we can use a constant variance, or impose some generic nonconstant structure based on observed behavior.

∑iβi ^{m s}i fr°^m risk-averse values of s*.)

Also, we mention here one useful variation: We may prefer not to view the residual term β_0 ln x_0 as part of the model, and instead write down a joint pdf only for ln x_1, ..., ln x_n. In this case we can use a double expectation for σ₁(x), where the inner expectation is with respect to the variables ln x_1, ..., ln x_n and the outer expectation is with respect to the residual. We might take the standard deviation σ(x_0) of the residual (taking β_0 = 1) as a constant, determined historically, or make an estimate based on some leverage model.

Now we can estimate the cdf F(x) by

$$F(x) = N(-d_2(x, \sigma_1(x), T))$$

as in the univariate case. To summarize, we use our multivariate model to determine parameters for a univariate model of the portfolio. After that is done, we can obtain probabilities for the portfolio without having to go back to the multivariate model, thus achieving a savings in time. We could take this one step further and think of randomly generating values of σ₁(x,T)T independently of any Monte Carlo philosophy (but perhaps still throwing away values of x too far out-of-the-money), and then using the values obtained to do the regression required in the Dumas-Fleming-Whaley approach.

5 "What-if" questions

The multivariate distribution lends itself to the study of many questions regarding conditional probabilities. For example, suppose that we want to know the effect of the increase or decrease of some segment of the market on a portfolio, or the increase or decrease of some macro-economic factor. BARRA, following earlier ideas of Ross, has viewed such macro-economic factors as portfolios with both long and short positions. Similarly, BARRA considers market segments associated to price-to-earnings ratios and other fundamental parameters, as well as to industry groupings, as portfolios. (See the book of Grinold-Kahn cited above.) Thus, we are led simply to consider the effect of one portfolio on another.

For definiteness, let us suppose the first portfolio is x, where as above

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n,$$

and the second portfolio is y, where

$$\ln y = \gamma_0 \ln y_0 + \gamma_1 \ln x_1 + \cdots + \gamma_n \ln x_n.$$

We take β_0 = γ_0 = 1, and view ln x_0 and ε = ln y_0 as residuals with mean 0. The latter residual is not assumed to be a factor in our multivariate model. Consider the following typical "what-if" question: Let A and B be given positive constants. If we know x > A at time T, what is the probability that y > B at time T? We give two approaches to this problem, the first probably quicker, but possibly not as accurate, using a regression to avoid at least some Monte Carlo calculations.

5.1 "What-if": An approach involving part regression, part Monte Carlo

We have ln y > ln B iff ln y − ε > ln B − ε. All correlations ρ_ij between ln x_i and ln x_j are assumed known. We may also assume that we have historical values of volatilities σ_i, with σ_i²T = Var(ln x_i). (Alternatively, we could estimate such values as expected values of implied volatilities, but it would not be difficult to maintain an inventory of historical values, and more in the spirit of this part of the calculation to do so.) Thus we can estimate the historical covariances between ln x and ln y − ε:

$$\operatorname{Cov}(\ln x, \ln y - \varepsilon) \approx \sum_{i,j} \beta_i \sigma_i \rho_{ij} \gamma_j \sigma_j T,$$

as well as

$$\sigma_{\ln x} = \sqrt{\operatorname{Var}(\ln x)}, \qquad \sigma_{\ln y - \varepsilon} = \sqrt{\operatorname{Var}(\ln y - \varepsilon)},$$

and the correlation

$$\rho(\varepsilon) = \rho_{\ln x,\, \ln y - \varepsilon} = \frac{\operatorname{Cov}(\ln x, \ln y - \varepsilon)}{\sigma_{\ln x}\, \sigma_{\ln y - \varepsilon}}.$$

This gives a standard regression for the variable ln y − ε expressed in standard deviations from its mean, in terms of a similarly standardized expression for ln x. Note that ε has mean 0 by construction. Put d₂(s*, x, σ) = (ln(s*/x) − σ²T/2)/(σ√T). Thus −d₂(s*_x, x, σ_{ln x}) measures standardized ln x using historical volatility, and −d₂(x) = −d₂(s*_x, x, σ₁(x)) measures "standardized" (warped) ln x using the cdf-implied volatility curve σ₁(x), as discussed in the previous section. Here s*_x denotes our best estimate for the value of x at time T.

Let σ₁(y, ε) denote the volatility curve associated with ln y − ε, which may be estimated as in the previous section (or computed from estimates of σ₁(y) and the standard deviation of the residual, if we are willing to view the residual as uncorrelated with ln y − ε, as is guaranteed in unconstrained regression). Put d₂(y, ε) = d₂(s*_y, ye^(−ε), σ₁(y, ε)), so that −d₂(y, ε) is a "standardized" measure of ln y − ε. Then the standard regression appropriate to our model is

$$-d_2(y, \varepsilon) = \rho(\varepsilon)\,(-d_2(x)).$$

There is a residual associated with this regression, which we have not written down. It is presumably normal, and its variance may be computed. For notational reasons we will just imagine it has been incorporated into the original ε. As is apparent from the form of the expressions in the display, an alternative to the above regression is to do it with the warped correlation coefficients suggested in the previous section. If, in addition, it was appropriate to view the original portfolios as linear combinations of warped variables (our standard normal marginals), the regression above could be done without any recourse to Monte Carlo calculations. Similar remarks would apply if we used constant historical volatility functions throughout, though presumably the latter procedure would lose accuracy.

In any case, we can now answer our "what-if" question as a simple expectation in the univariate normal distribution of the (adjusted) residual ε. Abbreviate d₂(s*_x, A, σ₁(A)) to d₂(A) and d₂(s*_y, Be^(−ε), σ₁(B, ε)) to d₂(B, ε). Assume ρ(ε) > 0 (the natural case of a positive correlation). Then we have

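$$\begin{aligned}
\Pr\{y > B \mid x > A\} &= E\big(\Pr(-d_2(y,\varepsilon) \ge -d_2(B,\varepsilon) \mid -d_2(x) \ge -d_2(A))\big) \\
&= E\big(\Pr(-d_2(x) \ge \rho(\varepsilon)^{-1}(-d_2(B,\varepsilon)) \mid -d_2(x) \ge -d_2(A))\big) \\
&= E\big(\min\{1,\; N(\rho(\varepsilon)^{-1} d_2(B,\varepsilon)) / N(d_2(A))\}\big).
\end{aligned}$$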

The first equation follows just because −d₂(y, ε) is monotonically increasing as a function of y; that is, the condition that y > B is completely equivalent to the condition −d₂(y, ε) > −d₂(B, ε). Similar remarks hold for the condition x > A, while the expression Pr{y > B | x > A} just means the probability that the condition y > B holds when it is known that x > A. The second equation is then derived with the displayed expression above for −d₂(y, ε). (If ρ(ε) is negative, the inequality involving its inverse reverses.) This inner expectation is then calculated in the normal distribution. For values of ε for which −d₂(A) is as large as ρ(ε)⁻¹(−d₂(B, ε)), the expectation is a certainty, and yields the value 1. When −d₂(A) is smaller than ρ(ε)⁻¹(−d₂(B, ε)), its cumulative normal distribution value N(−d₂(A)) is smaller than N(ρ(ε)⁻¹(−d₂(B, ε))), and the probability 1 − N(−d₂(A)) = N(d₂(A)) that the standard normal variable z = −d₂(x) is at least −d₂(A) is larger than the corresponding probability 1 − N(ρ(ε)⁻¹(−d₂(B, ε))) = N(ρ(ε)⁻¹d₂(B, ε)) that z be at least ρ(ε)⁻¹(−d₂(B, ε)). The ratio N(ρ(ε)⁻¹d₂(B, ε))/N(d₂(A)), which is the desired inner expectation, is thus smaller than 1, as is appropriate for a probability, conditioned or not. If ρ(ε) is negative, similar reasoning leads instead to the expression E(max{0, (N(d₂(A)) − N(ρ(ε)⁻¹d₂(B, ε)))/N(d₂(A))}) for the desired conditional probability. Although the final answer in either case is an expectation (over ε), it is essentially an integral that could be computed quickly with power series. (A very simple and accurate power-series expansion of N(z) is given on p. 252 of the book by Hull cited above.) Using that, one could determine by iterative methods what value of ε makes, say, the ratio N(ρ(ε)⁻¹d₂(B, ε))/N(d₂(A)) equal to 1, and then integrate the ratio against the standard normal pdf from −∞ to the determined value of ε, in the ρ(ε) > 0 case. Similar remarks apply if ρ(ε) < 0. (Note that, if ρ(ε) = 0, the variables ln x and ln y are uncorrelated, and the conditional probability Pr{y > B | x > A} is the same as the unconditional probability Pr{y > B}.)

All of the latter calculations can be done very fast. Of course, we have already used some Monte Carlo calculations to get this far, unless we are in the simplified context of constant volatility functions.
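As a hedged illustration of the inner conditional probability in this subsection, the sketch below evaluates the closed-form ratio for one fixed value of ε and compares it with a brute-force simulation of a standard normal z playing the role of −d₂(x). The function names are hypothetical, the inputs d₂(A), d₂(B, ε) and ρ(ε) are taken as already-computed numbers, and the outer expectation over ε is not performed.

```python
import random
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cdf


def inner_conditional_probability(d2_a, d2_b_eps, rho):
    """Closed-form Pr(y > B | x > A) for one fixed residual value.

    d2_a     -- the number d2(A)
    d2_b_eps -- the number d2(B, eps) for the chosen eps
    rho      -- the correlation rho(eps), assumed nonzero
    """
    if rho == 0:
        raise ValueError("rho = 0: use the unconditional probability instead")
    ratio = N(d2_b_eps / rho) / N(d2_a)
    if rho > 0:
        return min(1.0, ratio)
    # Negative correlation: (N(d2(A)) - N(d2(B, eps) / rho)) / N(d2(A)), floored at 0.
    return max(0.0, 1.0 - ratio)


def monte_carlo_check(d2_a, d2_b_eps, rho, n_samples=200_000, seed=0):
    """Simulate z = -d2(x) and count how often rho * z >= -d2(B, eps) given z >= -d2(A)."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        if z >= -d2_a:                  # the condition x > A
            kept += 1
            if rho * z >= -d2_b_eps:    # the condition y > B via the regression
                hits += 1
    return hits / kept if kept else float("nan")


exact = inner_conditional_probability(0.0, -0.5, 0.5)
approx = monte_carlo_check(0.0, -0.5, 0.5)
```

The two numbers should agree to within sampling error, which is a quick way to check the sign conventions in the positive and negative correlation cases.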

5.2 "What-if": The full Monte Carlo

It is easy to say how we would compute an answer to the same "what-if" question, using our full joint probability distribution. We simply write

$$\Pr\{y > B \mid x \ge A\} = E\big(\Pr(-d_2(y,\varepsilon) \ge -d_2(B,\varepsilon) \mid -d_2(x) \ge -d_2(A))\big)$$

and interpret ln x in −d₂(x), and ln y − ε in −d₂(y, ε), in terms of their expansions in ln x_0, ln x_1, ..., ln x_n. To compute, say, the inner expectation by a Monte Carlo calculation, we would generate a large number of random samples of multivariate standard normal vectors z with covariance matrix C. We then take the average, over the samples z which happen to satisfy −d₂(x) > −d₂(A), of the function which is 1 when −d₂(y, ε) > −d₂(B, ε) and 0 otherwise. We have not experimented to see whether this method yields better answers than the regression procedure above. Nevertheless, it illustrates how we could approach more sophisticated "what-if" questions that could not be easily treated by regressions. For example, suppose we believe that factor w will remain in a range C < w < D, and ask the same question about y, subject to the same condition on x. This is hard to formulate in terms of regression, and is simply not possible in terms of single-factor regression. However, it is easy to answer with the full distribution:

$$\Pr\{y \ge B \mid x \ge A,\; C < w < D\} = E\big(\Pr(-d_2(y,\varepsilon) \ge -d_2(B,\varepsilon) \mid -d_2(x) \ge -d_2(A),\; -d_2(C) < -d_2(w) < -d_2(D))\big).$$

Finally, we may not want to work in the log domain, which, if we started with a fixed portfolio x = h_1x_1 + h_2x_2 + ... + h_nx_n would force us into an approximation, as noted. But, working with the full distribution, we can phrase a condition x > A as h_1x_1(y_1) + ... + h_nx_n(y_n) > A, in the language of the first section where the vector of y's plays the role of our vector z here. Monte Carlo calculations can now proceed as before, using log domain expressions or not for the other conditions.
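The full Monte Carlo "what-if" calculation can be sketched as rejection sampling: draw correlated warped variables, keep only the samples that satisfy the conditions, and count how often y > B among them. The sketch below makes several simplifying assumptions that are not part of the method itself: only two factors, lognormal (constant-volatility) inverse warping, normally distributed residuals, and an optional range condition placed on the first factor. All function and parameter names are hypothetical.

```python
import math
import random


def what_if_probability(level_a, level_b, beta, gamma, s_star, sigma, rho, t,
                        resid_sd_x=0.0, resid_sd_y=0.0, factor_range=None,
                        n_samples=100_000, seed=0):
    """Estimate Pr(y > B | x > A [, C < x_1 < D]) for two log-domain portfolios.

    beta, gamma   -- exponents of the two factors in ln x and ln y
    s_star, sigma -- per-factor level estimate and volatility (ASSUMED lognormal)
    rho           -- correlation between the two warped factor variables
    factor_range  -- optional (C, D) range condition on the first factor
    """
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n_samples):
        # Correlated standard normals for the two factors.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Inverse warp to log prices (constant-volatility assumption).
        ln_f = [math.log(s) + sg * math.sqrt(t) * z - 0.5 * sg * sg * t
                for s, sg, z in zip(s_star, sigma, (z1, z2))]
        ln_x = sum(b * v for b, v in zip(beta, ln_f)) + rng.gauss(0.0, resid_sd_x)
        ln_y = sum(g * v for g, v in zip(gamma, ln_f)) + rng.gauss(0.0, resid_sd_y)
        if ln_x <= math.log(level_a):
            continue                      # condition x > A fails
        if factor_range is not None:
            c, d = factor_range
            if not (c < math.exp(ln_f[0]) < d):
                continue                  # range condition on the factor fails
        kept += 1
        if ln_y > math.log(level_b):
            hits += 1
    return hits / kept if kept else float("nan")


p = what_if_probability(100.0, 55.0, (0.7, 0.3), (0.2, 0.8),
                        (100.0, 50.0), (0.2, 0.3), 0.6, 1.0,
                        resid_sd_x=0.05, resid_sd_y=0.05,
                        factor_range=(90.0, 130.0))
```

Rejection sampling becomes wasteful when the conditioning event is rare, since most samples are discarded; that is the price paid for being able to state conditions of any form.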

6 "You gotta believe" questions

In the previous section we were focusing on an investor thinking about the value of his or her portfolio y in response to the change in a factor x. Conversely, an investor might want to know what the investment world looks like if a given stock or index y goes to a certain level B at time T. What is the expected value A at time T of another portfolio x, or simply of one of the factors x_i? Our main plan is, upon input by the user that y is going to level B, to list several assets x_i or factors/indices x most highly correlated with y and their expected values with y at B.

It would also be possible to display a confidence interval for each selected asset or factor, and have other information about its new projected probability distribution readily available. We could also offer comparisons with the old projected probability distribution of x, where no assumption on y is made. Finally, in some cases, where it was possible to explain much of the variance of y with just a few x_i (appearing in the regression of y), we could list percentage increases/decreases of a portfolio of these x_i required to make B the expected value of y, based solely on its dependence on this portfolio. (For example, the coefficients in the portfolio could come from the regression of y with respect to all the x_i, or some new regression might be done, perhaps allowing user-defined constraints.) It should be mentioned that medians or modes are alternatives to expected values (means) here and above; in any case users will need to be educated about the fact that the median and mode differ systematically from the mean in near lognormal distributions.

The main problem might be viewed as understanding the probability distribution of x, given that y > B at a given time T, with x and y as in the previous section. This can be approached by the methods of the previous sections, by reversing the roles of the variables.

There is, however, a simpler question that can be treated in an especially quick way. Consider the problem of determining the mean of x conditioned on the equality y = B at time T. The idea is to use simple regression methods, but interpret answers as measured in terms of our variable volatilities. In our previous notation, we have a regression

$$-d_2(x) = \rho \cdot (-d_2(y, \varepsilon)) + \upsilon,$$

where ρ (which we called ρ(ε) in the previous section) is the historically determined correlation between ln x and the random variable ln y − ε. Note that the roles of dependent and independent variable are reversed. There is also a residual υ, which has mean 0 here, and plays no role (gets averaged away). Thus, the desired conditional expected value A of x is obtained from

$$-d_2(A) = E(-d_2(x) \mid y = B) = \rho \cdot E\big(-d_2(s_y^*, Be^{-\varepsilon}, \sigma_1(y,\varepsilon))\big) = \rho \cdot \big(-d_2(s_y^*, B, \sigma_1(y,\varepsilon))\big).$$

Recall that σ₁(y, ε) is an estimate, obtained by Monte Carlo methods, of the implied volatility σ₁ associated to the random variable ln y − ε. For faster but less accurate calculations it can be estimated historically as Σ_{i,j} β_i σ_i ρ_ij β_j σ_j, with each of the σ's, β's and ρ's here given historically. (See the previous section for notation.) Similarly, for fast calculations, −d₂(x) could use historical volatility, though we expect it to be given more accurately, or rather, more accurately according to the market view, as −d₂(x) = −d₂(s*_x, x, σ₁(x)), using the implied volatility function estimate σ₁(x). If x = x_i is a single asset or index in our model, then σ₁(x) = σ₁(x_i) does not require a Monte Carlo estimate, but is presumably already available.

To summarize, the conditional expected values required to answer "you gotta believe" questions are easily obtained by regression methods. The accuracy of such answers is enhanced, or at least shaped more to reflect market input, when all logvariables are measured in "standard deviations," interpreted as our variable volatilities.
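A minimal worked sketch of this regression shortcut follows. It assumes constant volatilities, ignores the residual ε, and takes d₂(s*, x, σ) = (ln(s*/x) − σ²T/2)/(σ√T); the function names and numerical inputs are hypothetical. Under those assumptions the relation −d₂(A) = ρ · (−d₂(s*_y, B, σ_y)) can be solved for A in closed form.

```python
import math


def d2(s_star, x, sigma, t):
    """ASSUMED form of d2 with a constant volatility sigma over horizon t."""
    return (math.log(s_star / x) - 0.5 * sigma ** 2 * t) / (sigma * math.sqrt(t))


def conditional_level(level_b, s_x, sigma_x, s_y, sigma_y, rho, t):
    """Solve -d2(s_x, A, sigma_x) = rho * (-d2(s_y, B, sigma_y)) for A."""
    z_x = rho * (-d2(s_y, level_b, sigma_y, t))   # warped value of x implied by y = B
    return s_x * math.exp(sigma_x * math.sqrt(t) * z_x - 0.5 * sigma_x ** 2 * t)


# If index y (best estimate 100, volatility 20%) goes to 120 in one year, where
# does an asset x (best estimate 80, volatility 25%, correlation 0.6) end up?
a_level = conditional_level(120.0, 80.0, 0.25, 100.0, 0.20, 0.6, 1.0)
```

With nonconstant volatility curves σ₁(x) the last step no longer has a closed form, and A would be found by a one-dimensional root search on −d₂(s*_x, A, σ₁(A)).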

7 Portfolios containing option securities

We conclude this document by briefly pointing out that our methods, when using full Monte Carlo calculations, easily apply to portfolios containing option securities. The well-known idea is to think of an option as a kind of nonlinear portfolio (a quadratic one, to be more precise). Thus, an option on a single underlying security with underlying price x_1 has a price approximately x = c + Δ(x_1 − s_1) + (1/2)Γ(x_1 − s_1)² for x_1 near s_1, where the option was evaluated to a known value c. Here Δ and Γ are well-known parameters in the options markets, giving the first and second derivatives of the option price at s_1 with respect to the underlying security price x_1. Perhaps the most characteristic feature of options is that they have nonzero Γ: their proportion of increase or decrease with respect to the underlying security price changes as the security price changes. Explicit formulas in terms of other standard parameters are available, say, in the Black-Scholes theory for both Δ and Γ (see the Hull book cited above). Such formulas could be obtained by differentiation directly in other theories or when using empirically-fitted curves. In any case, once we have such an explicit approximation to x, its probability distribution is easily given by the Monte Carlo methods of Subsection 3.1 above. The same method applies as well to portfolios containing several options and other securities.
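The quadratic (delta-gamma) approximation can be dropped into the same sampling loop as any other nonlinear function of the underlying price. The sketch below is illustrative only: the names are hypothetical, the underlying price is sampled from an assumed lognormal distribution with constant volatility, and the quadratic expression is only trustworthy for x_1 near s_1, so samples far from s_1 should be treated with caution.

```python
import math
import random


def delta_gamma_value(x1, s1, c, delta, gamma):
    """Quadratic approximation c + delta*(x1 - s1) + 0.5*gamma*(x1 - s1)**2."""
    return c + delta * (x1 - s1) + 0.5 * gamma * (x1 - s1) ** 2


def option_value_cdf(level, s1, c, delta, gamma, s_star, sigma, t,
                     n_samples=50_000, seed=0):
    """Estimate Pr(option value <= level) at time t by sampling the underlying."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        # ASSUMED lognormal underlying; an implied warping function could replace this.
        x1 = s_star * math.exp(sigma * math.sqrt(t) * z - 0.5 * sigma ** 2 * t)
        if delta_gamma_value(x1, s1, c, delta, gamma) <= level:
            hits += 1
    return hits / n_samples


# Option currently worth 5.0 with the underlying at 100, delta 0.5, gamma 0.02.
p_loss = option_value_cdf(5.0, 100.0, 5.0, 0.5, 0.02, 100.0, 0.2, 0.25)
```

A portfolio holding several options and other securities would sum such terms inside the loop, with the underlying prices drawn jointly as in the multivariate sampling described earlier.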

Claims

1. A method comprising:

receiving data representing current prices of options on a given asset,

deriving from said data an estimate of a corresponding implied probability distribution of the price of said asset at a future time, and

making information about said probability distribution available within a time frame that is useful to investors.

2. The method of claim 1 in which the data represent a finite number of prices of options at spaced-apart strike prices of the asset, and also including

calculating a set of first differences of said finite number of prices to form an estimate of the cumulative probability distribution of the price of said asset at a future time.

3. The method of claim 2 also including

calculating a set of second differences of the finite number of strike prices from the set of first differences to form an estimate of the probability distribution function of the price of said asset at a future time.

4. A method comprising:

receiving data representing current prices of options on a given asset,

deriving from said data an estimate of a corresponding implied probability distribution of the price of said asset at a future time, and

providing a real-time data feed containing information based on said probability distribution.

5. A method comprising:

providing a graphical user interface for viewing pages containing financial information related to an asset; and

when a user indicates an asset of interest, displaying probability information related to the price of the asset at a future time.

6. A method comprising:

enabling a user to identify an asset of interest, the asset being one for which data representing current prices of options on the asset are available,

providing a display of a probability distribution of prices of the asset at future times.

7. A method comprising:

enabling a user to indicate a future time and to identify an asset of interest, the asset being one for which data representing current prices of options on the asset are available, and

displaying to the user a distribution of the probability that the asset will reach prices within a range of prices at the future time.

8. A method comprising:

receiving data representing current prices of options on a given asset, the options being associated with spaced-apart strike prices of the asset at a future time,

the data including shifted current prices of options resulting from a shifted underlying price of the asset, the amount by which the asset price has shifted being different from the amount by which the strike prices are spaced apart, and

deriving from said data an estimate of a quantized implied probability distribution of the price of said asset at a future time, the elements of the quantized probability distribution being more finely spaced than for a probability distribution derived without the shifted current price data.

9. A method comprising

deriving from said data an estimate of an implied probability distribution of the price of said asset at a future time, the mathematical derivation including a smoothing operation, and

10. The method of claim 9 in which the smoothing operation is performed in a volatility domain.

11. The method of claim 9 in which the smoothing operation is performed in the domain of the option prices or in the domain of the probability distribution information.

12. A method comprising:

receiving data representing current prices of options on a given asset, the options having strike prices at future dates,

deriving a volatility for each of the future dates in accordance with a predetermined option pricing formula that links option prices with strike prices of the asset;

generating a smoothed and extrapolated volatility function;

and using the volatility information to generate information within a time-frame that is useful for investors.

13. The method of claim 12 in which the volatility function is extrapolated to a wider range of dates than the future dates.

14. The method of claim 12 in which the volatility function is extrapolated to strike prices other than the strike prices of the options.

15. The method of claim 9 also including

generating a smoothed volatility function using only data that are reliable under a predetermined measure of reliability.

16. The method of claim 9, further comprising:

generating an implied volatility function formula having a quadratic form with two variables representing a strike price and an expiration date;

wherein coefficients of the implied volatility function formula are determined by applying regression analysis to approximately fit the implied volatility function formula to each of the implied volatilities.

17. A method comprising:

receiving data representing current prices of options on assets belonging to a portfolio,

deriving from said data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on the assets belonging to the portfolio, and

18. A method comprising:

receiving data representing values of a set of factors that influence a composite value,

deriving from said data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on assets belonging to a portfolio, and

19. The method of claim 18 in which the mathematical derivation includes generating a multivariate probability distribution function based on correlations among the factors.

20. A graphical user interface comprising:

a user interface element adapted to enable a user to indicate a future time;

a user interface element adapted to show a current price of an asset; and

a user interface element adapted to show the probability distribution of the price of the asset at the future time.

21. A method comprising:

continually generating current data that contains probability distributions of prices of assets at future times,

continually feeding the current data to a recipient electronically, and

the recipient using the fed data for services provided to users.

22. A method comprising:

receiving data representing current prices of market transactions associated with a second portfolio of assets, and

providing information electronically on the probability that the second portfolio of assets will reach a first value given the condition that the first portfolio of assets reaches a specified price at a future time.

23. A method comprising:

receiving data representative of actual market transactions associated with a first portfolio of assets;

receiving data representative of actual market transactions associated with a second portfolio of assets;

providing, through a network, information on the expectation value of the price of the first portfolio of assets given the condition that the second portfolio of assets reaches a first specified price at a specified future time.
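The conditional quantities recited in claims 22 and 23 can be illustrated with a small closed-form sketch. It assumes, solely for the example, that the log prices of the two portfolios at the future time are jointly normal with known means, standard deviations, and correlation; the claims do not prescribe this model, and all function and parameter names below are hypothetical.

```python
import math
from statistics import NormalDist


def conditional_log_params(mu_x, sigma_x, mu_y, sigma_y, rho, y_price):
    """Mean and std of log X given that portfolio Y is priced at y_price.

    Assumes (log X, log Y) is bivariate normal with correlation rho.
    """
    cond_mean = mu_x + rho * sigma_x / sigma_y * (math.log(y_price) - mu_y)
    cond_std = sigma_x * math.sqrt(max(0.0, 1.0 - rho * rho))
    return cond_mean, cond_std


def conditional_expected_price(mu_x, sigma_x, mu_y, sigma_y, rho, y_price):
    """Expectation of X given Y = y_price (mean of a lognormal variable)."""
    m, s = conditional_log_params(mu_x, sigma_x, mu_y, sigma_y, rho, y_price)
    return math.exp(m + 0.5 * s * s)


def conditional_prob_above(mu_x, sigma_x, mu_y, sigma_y, rho, y_price, x_level):
    """Probability that X exceeds x_level given Y = y_price."""
    m, s = conditional_log_params(mu_x, sigma_x, mu_y, sigma_y, rho, y_price)
    if s == 0.0:
        # Perfect correlation: X is fully determined by Y.
        return 1.0 if m > math.log(x_level) else 0.0
    return 1.0 - NormalDist(m, s).cdf(math.log(x_level))
```

With zero correlation the condition carries no information and both functions reduce to the unconditional lognormal results; with perfect correlation the conditional distribution collapses to a single price.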

24. A method comprising

evaluating an event defined by a first multivariate expression that represents a combination of macroeconomic variables at a time T, and

estimating the probability that a second multivariate expression that represents a combination of values of assets of a portfolio will have a value greater than a constant B at time T if the value of the first multivariate expression is greater than a constant A.

25. The method of claim 24 in which the probability is estimated using Monte Carlo techniques.
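A minimal sketch of the Monte Carlo estimate of claims 24 and 25 follows. It counts the simulated scenarios in which the first expression exceeds A and, among those, the scenarios in which the second expression also exceeds B. The scenario sampler, which here draws two correlated standard normal values, and all names are assumptions made for illustration; any joint model of macroeconomic variables and asset values could be substituted.

```python
import math
import random


def conditional_probability_mc(sample_scenario, first_expr, second_expr,
                               a, b, n_trials, rng):
    """Estimate P(second_expr > b | first_expr > a) by simulation.

    Returns None if the conditioning event never occurs in the sample.
    """
    hits_a = 0
    hits_both = 0
    for _ in range(n_trials):
        scenario = sample_scenario(rng)
        if first_expr(scenario) > a:
            hits_a += 1
            if second_expr(scenario) > b:
                hits_both += 1
    if hits_a == 0:
        return None
    return hits_both / hits_a


def make_correlated_sampler(rho):
    """Hypothetical sampler: (macro factor, portfolio value) with correlation rho."""
    def sample(rng):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        return (z1, z2)
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = conditional_probability_mc(
        make_correlated_sampler(0.5),
        lambda s: s[0],   # first multivariate expression (macro side)
        lambda s: s[1],   # second multivariate expression (portfolio side)
        0.0, 0.0, 40000, rng)
    print(estimate)
```

For this particular sampler the exact answer with A = B = 0 is 1/2 + arcsin(rho)/pi, which gives a convenient check on the simulation.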

26. A method comprising

defining a regression expression that relates the value of one variable representing a combination of macroeconomic variables at time T to a second variable at time T that represents a combination of assets of a portfolio, and

estimating the probability that the second variable will have a value greater than a constant B at time T if the value of the first variable is greater than a constant A at time T, based on the ratio of the probability of x being greater than A under the regression expression and the probability of x being greater than A.
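One way to read the ratio recited in claim 26, offered here as an interpretation rather than as the claimed formula, is the ordinary definition of conditional probability applied to a linear regression. The symbols x (first variable), y (second variable), \alpha, \beta, and \varepsilon are introduced for this illustration only.

```latex
% Assumed linear regression relating the two variables at time T
y = \alpha + \beta x + \varepsilon

% Conditional probability expressed as a ratio of two probabilities
P(y > B \mid x > A) = \frac{P(y > B,\; x > A)}{P(x > A)}
```

Under this reading the numerator is evaluated using the regression expression, which ties y to x, and the denominator is the unconditional probability that x exceeds A.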

27. A method comprising

defining a current value of an option as a quadratic expression that depends on the difference between the current price of the option and the current price of the underlying security, and

using Monte Carlo techniques to estimate a probability distribution of the value at a future time T of a portfolio that includes the option.
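The following sketch illustrates claim 27 under stated assumptions. It treats the quadratic expression as the familiar delta-gamma approximation in the change of the underlying price, and it simulates the underlying price at time T with a lognormal model. Both choices, along with every function and parameter name, are assumptions of the example and are not taken from the claim.

```python
import math
import random


def quadratic_option_value(v0, delta, gamma, s0, s):
    """Quadratic (delta-gamma style) approximation of the option value."""
    ds = s - s0
    return v0 + delta * ds + 0.5 * gamma * ds * ds


def simulate_portfolio_values(s0, drift, vol, horizon, shares, n_options,
                              v0, delta, gamma, n_trials, rng):
    """Sorted Monte Carlo sample of the portfolio value at the horizon.

    The portfolio holds `shares` units of the underlying and `n_options` options.
    """
    values = []
    for _ in range(n_trials):
        z = rng.gauss(0.0, 1.0)
        # Lognormal terminal price of the underlying (assumed model)
        s_t = s0 * math.exp((drift - 0.5 * vol * vol) * horizon
                            + vol * math.sqrt(horizon) * z)
        option = quadratic_option_value(v0, delta, gamma, s0, s_t)
        values.append(shares * s_t + n_options * option)
    values.sort()
    return values


def empirical_quantile(sorted_values, q):
    """Value at quantile q (0 <= q <= 1) of a sorted sample."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]
```

The sorted sample returned by `simulate_portfolio_values` is an empirical estimate of the probability distribution; quantiles such as the 5th and 95th percentile can be read from it with `empirical_quantile`.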

28. A method comprising

displaying to a user a circular visualization element having sectors arranged around a center of the element, the sectors respectively corresponding to different groups of assets,

in each of the sectors, displaying an array of visual elements representative of respective assets belonging to the group to which the sector corresponds, the visual elements being arrayed with respect to distance from the center in accordance with magnitudes of performance of the assets during a recent period.

29. The method of claim 28 in which the visual elements comprise displayed dots, one for each of the assets.

30. The method of claim 28 in which the visual elements exhibit visible characteristics that correspond to categories of the assets within the group.

31. The method of claim 30 in which the categories of the assets within the group correspond to different capitalizations.

32. The method of claim 29 in which dots are arranged along a radius of the sector to which they belong.

33. The method of claim 32 in which dots that would otherwise lie on the radius at a given distance from the center are displayed at different angular positions near to the radius.

34. The method of claim 28 in which the sectors have angular extents that represent the fractions of the total number of asset items represented by the respective sectors.

35. The method of claim 28 in which the circular visualization element is subdivided into rings having respectively different distances from the center.

36. The method of claim 35 in which the rings are displayed in different colors.

37. The method of claim 28 in which the magnitudes of performance of the assets are measured in percentage price change.

38. The method of claim 28 in which the recent period comprises a trading day on an asset market.

39. The method of claim 28 in which the assets comprise securities issued by corporations.

40. A displayed visualization element that

is circular,

has sectors arranged around a center of the element, the sectors respectively corresponding to different groups of securities issued by corporations,

in each sector, has an array of dots representing respective securities belonging to the group to which the sector corresponds, each of the dots lying on or near a radius of the sector and each having a distance from the center along the radius that corresponds to the percentage change in the price of the represented security during a trading day, and

has differently colored rings at respectively different distances from the center.
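The geometry of the circular element described in claims 28 through 40 can be sketched as a polar layout computation. In this illustration each sector's angular extent is proportional to its number of assets (claim 34), each dot's distance from the center is a linear function of its percentage price change, and dots that would coincide on the sector's central radius are offset to nearby angles (claim 33). The linear mapping of the range from -max_abs_change to +max_abs_change onto the radius, the bucket size used to detect coinciding dots, and all names are assumptions of the example.

```python
import math


def sector_angles(group_sizes):
    """Map each group to (start, end) angles in radians, sized by asset count."""
    total = sum(group_sizes.values())
    angles = {}
    start = 0.0
    for group, count in group_sizes.items():
        extent = 2.0 * math.pi * count / total
        angles[group] = (start, start + extent)
        start += extent
    return angles


def layout_dots(assets, max_abs_change, max_radius=1.0, bucket=0.02, spread=0.03):
    """Place one dot per asset; assets is a list of (name, group, pct_change).

    Returns a list of (name, radius, theta) tuples in polar coordinates.
    """
    sizes = {}
    for _, group, _ in assets:
        sizes[group] = sizes.get(group, 0) + 1
    angles = sector_angles(sizes)
    seen = {}
    placed = []
    for name, group, change in assets:
        clipped = max(-max_abs_change, min(max_abs_change, change))
        # Assumed mapping: worst change at the center, best at the rim
        radius = max_radius * (clipped + max_abs_change) / (2.0 * max_abs_change)
        start, end = angles[group]
        mid = 0.5 * (start + end)
        key = (group, round(radius / bucket))
        k = seen.get(key, 0)
        seen[key] = k + 1
        # Offsets 0, +1, -1, +2, -2, ... steps away from the central radius
        step = (k + 1) // 2 * (1 if k % 2 else -1)
        placed.append((name, radius, mid + step * spread))
    return placed
```

The returned polar coordinates can be handed to any plotting layer; colored rings (claims 35 and 36) would simply be annuli drawn at fixed radii behind the dots. This sketch does not check whether many coinciding dots spill past the sector boundary.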

41. A method comprising

displaying to a user a visualization element that indicates the odds of a performance measure of an asset being within specified ranges of identified values of the performance measure at a succession of times in the future.

42. The method of claim 41 in which the performance measure comprises a price of the asset.

43. The method of claim 41 in which the performance measure comprises a return percentage.

44. The method of claim 41 in which the performance measure comprises a tax-adjusted return percentage.

45. The method of claim 41 in which the visualization element includes stripes superimposed on a graph of the performance measure over time, each of the stripes representing one of the specified ranges.

46. The method of claim 45 in which each of the stripes begins at a current time and becomes broader as it extends to future times.
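The widening stripes of claims 45 and 46 can be illustrated by computing price levels at fixed cumulative probabilities for a succession of future times; each stripe is then the region between two adjacent levels. The sketch below assumes a lognormal price model with constant drift and volatility, which is a stand-in chosen for brevity and is not asserted to be the distribution used to produce the display. All names are hypothetical.

```python
import math
from statistics import NormalDist


def price_bands(current_price, drift, vol, horizons, probabilities):
    """Price levels at the given cumulative probabilities for each horizon.

    horizons are times in years from now; probabilities must lie strictly
    between 0 and 1. Returns one row of price levels per horizon.
    """
    std_normal = NormalDist()
    bands = []
    for t in horizons:
        row = []
        for p in probabilities:
            z = std_normal.inv_cdf(p)
            level = current_price * math.exp(
                (drift - 0.5 * vol * vol) * t + vol * math.sqrt(t) * z)
            row.append(level)
        bands.append(row)
    return bands


if __name__ == "__main__":
    # Stripe edges at the 10%, 50%, and 90% levels over one year
    for t, row in zip([0.0, 0.25, 0.5, 1.0],
                      price_bands(100.0, 0.08, 0.3, [0.0, 0.25, 0.5, 1.0],
                                  [0.1, 0.5, 0.9])):
        print(t, [round(x, 2) for x in row])
```

At a horizon of zero every level equals the current price, so each stripe begins at a single point at the current time and broadens as the horizon grows, as claim 46 describes.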

47. The method of claim 41 also including

displaying a graphical device that shows actual historical values of the performance measure.

48. The method of claim 47 in which the graphical device that shows actual historical values is a line graph one end of which joins the visualization element at a point which represents a current date.

49. The method of claim 41 in which the visualization element includes two portions, one of the portions representing the odds prior to a specified date based on one assumption, the other of the portions representing the odds after the specified date based on another assumption.

50. The method of claim 49 in which the specified date is a date on which tax effects change from the one assumption to the other assumption.

51. A method comprising

displaying to a user a visualization element having graphical indicators of the relative performance of a selected asset compared with the performance of groups of assets in each of a succession of time periods, each of the groups comprising assets representing a common style.

52. The method of claim 51 in which the style comprises a class of investment objectives.

53. The method of claim 51 in which the relative performance is determined using an asset class factor model.

Attorney Docket 11910-002001
