|
|
To add a forecast, choose the Add Forecast command from
the Forecast menu. The dialog shown below is presented. The
field at the top shows the worksheet location of the
upper left corner of the forecast display. Because the display
can be placed anywhere, several forecasts may be placed on
the same worksheet.
|
|
|
The Name field supplies the name used for the named ranges
that the add-in creates on the worksheet. Each forecast must
have a unique name. The name must start with a letter and
include no spaces or punctuation.
The Data Columns field specifies the number
of individual time series that will be placed on the
display. More than one is useful when
the user has several series to track simultaneously,
such as the stock prices in a portfolio or the sales
volumes for different products. Each time series will
have a separate forecast.
The Time Horizon is the number of
periods to be included in the series. This number can
be changed using the Change command. The History is
the number of periods of data used to obtain the first
forecast. All the methods require this warm-up period
to obtain valid forecasts. The history is important
because it limits several features of the forecast. For
instance, the number of observations in a moving average
and the number of observations in a regression forecast
cannot exceed the number of periods in the history.
The forecast interval is also limited by the history.
|
|
|
A nonzero entry in the Extra Data Columns field will
cause the add-in to create extra columns that are placed to
the left of the data columns. These might be useful for holding
information associated with the data. For example, on
the Investments page we
use a single extra data column to hold the dates associated
with closing stock prices.
A nonzero entry in the Extra Results Columns field
will cause the add-in to create extra columns in the display
that appear to the right of the forecasts. These might be useful
for additional processing of the forecasts. This is illustrated
on the Investments page
where we use several extra columns to make stock buy and sell
decisions.
When the Simulate button is checked, data will be
simulated using Monte Carlo simulation. The results are similar
to the Simulation command on the forecast menu, but
the data placed on the worksheet is fixed, rather than controllable
through simulation parameters.
When the Freeze Panes button is checked, the worksheet
window is divided into four sections and the panes are frozen.
This is useful for large data sets where the active data cells
may be far removed from the row and column titles.
The Seasonal button allows the analysis of time series
with cyclical variations. The number of data points in a cycle
is placed in the field below the button.
The Include Forecast button determines whether the
display will include forecasts for future times. For example
a moving average forecast places one column on the display
holding the moving average. When this button is checked, two
additional columns are also included, one holding forecasts
for future times and one holding error measurements for the
forecast. Usually one would check this button, but for some
applications future forecasts are not required. Again we illustrate
this on the Investments page.
The other pages of this section all include forecasts.
The dialog requires the selection of one of the forecasting
methods using the buttons on the right. In the remainder of
this page, we illustrate the various forecasting methods. |
Moving Average |
|
|
Clicking the OK button causes the add-in to construct
the form shown at the left. Cell C3 gives the name of the
data. This cell should not be changed. Cell D4 contains
the number of data points to be included in the moving
average. This number can be changed to observe the effects
of different lengths. It can be no greater than the history,
10 in this case.
The data is in column C. Cells C10 through C19 hold data
for the warm-up period. For a moving average of 10, there
must be at least 10 data points available for the first
forecast. The data shown is a simulated random variable
with a mean of 50 and a standard deviation of 5. Column
D holds the estimate of the mean of the time series. The
moving average estimates only the mean. Column E shows
the forecast. The time interval, t, for the estimate
is set in E4 by the user. For the example t is
2. The number can be changed but is limited from above
by the history. The entries in column E are offset by t periods
from the moving average value from which the forecast was
derived. For example, the entry in E21 is the forecast
for time 2 based on the moving average computed for time
0 in cell D19. The two values are equal because the moving
average assumes a constant for the underlying mean value
of the time series. The column is labeled Fore(2) to indicate
that it holds an estimate of the time series based on the
mean value estimated two periods earlier. |
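The relationship between the moving average column and the forecast column can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the add-in's worksheet formulas. The function names `moving_average` and `shift_forecast` and the sample data are hypothetical.

```python
from statistics import mean

def moving_average(data, m):
    """Average of the last m observations; None until m points are available."""
    return [mean(data[i - m + 1:i + 1]) if i >= m - 1 else None
            for i in range(len(data))]

def shift_forecast(estimates, t):
    """The forecast for a period is the estimate made t periods earlier."""
    return [None] * t + estimates[:len(estimates) - t]

# Illustrative data only (not the values on the worksheet).
data = [52, 48, 50, 47, 53, 51, 49, 50, 46, 54, 55, 45]
ma = moving_average(data, 10)   # first estimate appears at index 9 (time 0)
fore = shift_forecast(ma, 2)    # forecast interval t = 2
print(ma[9], fore[11])          # the forecast for time 2 equals the time-0 estimate
```

Because the moving average assumes a constant mean, the forecast is just the earlier estimate moved forward by t periods.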
|
|
Column F computes
the forecast errors. The error is simply the difference between
the value observed for a period and the forecasted value. For
example, the entry for period 2 which appears in cell F21 is
the value of the data in period 2 (cell C21) less the value
of the forecast (cell E21).
The means and standard deviations of the columns are computed
in rows 6 and 7. The moving average column should have less
variability than the data column because moving average eliminates
some of the noise variability. The error column variability
depends on both the data variability and the variability of
the forecasts.
Row 8 (cell F8) shows the MAD, or mean absolute deviation, of
the forecast errors. This is sometimes used as a measure of forecast
quality.
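As a rough sketch of the error column and the MAD, assuming the error is the observation minus the forecast as described above, the calculation might look like this in Python. The helper names and the numbers are hypothetical.

```python
from statistics import mean

def forecast_errors(data, forecasts):
    """Observed value minus forecast, skipping periods where either is missing."""
    return [x - f for x, f in zip(data, forecasts)
            if x is not None and f is not None]

def mean_absolute_deviation(errors):
    """MAD: the average of the absolute forecast errors."""
    return mean([abs(e) for e in errors])

# Illustrative values only.
observed = [50, 47, 53]
forecast = [49, 50, 50]
errors = forecast_errors(observed, forecast)
print(errors, mean_absolute_deviation(errors))
```

Unlike the mean error, the MAD does not let positive and negative errors cancel, which is why it is sometimes preferred as a quality measure.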
In practice, one would fill in a table like this one period
at a time as data is observed. Data not yet available are indicated
by ***, as shown for periods 16 through 20. Moving averages
are computed for these periods because the moving average function
looks back the specified number of periods (10) and uses as
many numeric values as available. For example, the moving average
for period 20 (in cell D39) is the average of the five data
points for periods 11 through 15. |
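The look-back behavior described above, where the average uses only the numeric values found in the window, can be sketched as follows. This is an assumption about the logic, not the add-in's actual function. The name `moving_average_available` and the data are hypothetical.

```python
def moving_average_available(data, i, m):
    """Average the numeric entries among the m periods ending at index i."""
    window = data[max(0, i - m + 1):i + 1]
    values = [x for x in window if isinstance(x, (int, float))]
    return sum(values) / len(values) if values else None

# Periods 1-15 observed, periods 16-20 not yet available (shown as ***).
data = [50] * 10 + [40, 42, 44, 46, 48] + ["***"] * 5
# Period 20 is index 19; its window covers periods 11-20, of which 11-15 are numeric.
print(moving_average_available(data, 19, 10))
```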
|
|
The selection of the number to be included in the moving
average is the prerogative of the analyst. If the mean
of the underlying time series is truly constant, the average
should be as long as possible. In reality the mean value
may be changing. If the forecast is to discover changes
quickly, the number in the average should be low.
The same series is shown at the left with 5 periods in
the moving average. The standard deviation of the forecast
has increased because, with fewer periods in the average,
the noise has more effect. |
|
Exponential Smoothing |
|
|
The display at the left shows the same data with forecasts
provided by the exponential smoothing method.
This method has a single parameter Alpha. A large
value of Alpha tracks rapidly varying time series
better than a small value, but a large value is more affected
by noise. The value of Alpha for the example is
in cell J4. The default value is
2/(history + 1)
Alpha can be changed by the user.
The forecast for time 0 (cell J19) is computed by a moving
average with length equal to the history (10 in this case).
Exponential smoothing requires such an estimate to get
started.
The time interval of the estimate is in K4. The estimates
for the mean of the series are in column J and the forecasts
are in column K. Since exponential smoothing only estimates
the mean, the forecast will be the same as the estimate t periods
earlier. |
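A minimal Python sketch of exponential smoothing follows. The page does not state the update formula, so the sketch assumes the common recurrence, new estimate = Alpha * observation + (1 - Alpha) * previous estimate. The default Alpha and the moving-average start-up follow the description above. The function name and the data are hypothetical.

```python
def exponential_smoothing(data, history, alpha=None):
    """Return the estimates for time 0 onward.

    Assumes the standard recurrence (an assumption, not taken from the add-in):
        estimate = alpha * x + (1 - alpha) * previous_estimate
    """
    if alpha is None:
        alpha = 2 / (history + 1)               # default given in the text
    estimate = sum(data[:history]) / history    # time-0 estimate: moving average
    estimates = [estimate]
    for x in data[history:]:
        estimate = alpha * x + (1 - alpha) * estimate
        estimates.append(estimate)
    return estimates

# Illustrative data: ten warm-up values followed by one new observation.
data = [50] * 10 + [61]
print(exponential_smoothing(data, history=10))
```

Under this recurrence, Alpha is the weight given to the newest observation, and the forecast for t periods ahead is simply the latest estimate.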
|
Regression |
|
|
The regression method is similar to the moving average
method in that it uses a fixed interval to determine forecasts.
The example uses 10 periods. The method takes the last
10 observations and constructs a linear regression estimate
for the mean of the time series. There are two results
of the regression, a constant term labeled Reg. a, and
a linear term labeled Reg. b. Estimates are computed with
the linear expression:
Est(t) = a + bt
where t is the interval between
the time of the estimation and the estimated mean value.
For example, the estimate for time 4 (cell R23) is based
on the regression coefficients computed at time 2 (cells
P21 and Q21). With all values rounded to two decimal places,
this is
Est(2) = 49.95 + (0.22)(2) = 50.40
Excel carries many more significant digits than
are displayed, so arithmetic on the rounded values can differ
slightly from the displayed result. Here the rounded terms
give 50.39, but the displayed value of 50.40, computed from
the unrounded coefficients, is more accurate.
The data observed for time 4 (cell O23) is
38.30, so the error for time 4 (cell S23) is
38.30 - 50.40 = -12.11
(again computed from unrounded values; the rounded terms give -12.10).
The regression method is useful for time
series that have a trend component. Again the choice of
the length parameter is important. For rapidly changing
series the length should be short. For slowly changing
series that can include a trend the length should be long.
The value of the length can be changed in cell P4, but
it cannot exceed the history. |
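The regression forecast can be sketched in Python as an ordinary least-squares line fitted to the last few observations. The sketch assumes that time is measured relative to the moment of estimation, so the most recent observation sits at 0 and the constant term a is the fitted mean at that moment. That choice is what makes Est(t) = a + bt hold with t as the forecast interval, but it is an assumption about how the add-in sets up the fit. The function names are hypothetical.

```python
def regression_coefficients(window):
    """Least-squares fit over the window, with the latest observation at time 0.

    Returns (a, b) such that the estimated mean t periods ahead is a + b*t.
    Requires at least two observations.
    """
    m = len(window)
    xs = range(-(m - 1), 1)                     # times -(m-1), ..., -1, 0
    x_bar = sum(xs) / m
    y_bar = sum(window) / m
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, window))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def estimate(a, b, t):
    """Est(t) = a + b*t."""
    return a + b * t

# Illustrative data: a perfect trend of 2 per period ending at 28.
a, b = regression_coefficients([10 + 2 * k for k in range(10)])
print(a, b, estimate(a, b, 2))

# The rounded coefficients quoted in the text give 50.39, not the displayed 50.40.
print(round(estimate(49.95, 0.22, 2), 2))
```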
|
Exponential Smoothing with Trend |
|
|
This method is also used when the series might have a trend.
It is similar to the exponential smoothing method, but also estimates
the trend component. Like the regression method,
it estimates both the constant term and the trend. It has
two parameters, Alpha and Beta. The estimates
for time 0 are obtained with a regression forecast. |
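The page does not give the update equations for this method. As a hedged illustration, the sketch below uses the widely known Holt two-parameter form, which may differ in detail from what the add-in computes. The starting level and trend are passed in as arguments, standing in for the regression estimates at time 0. All names and data are hypothetical.

```python
def smoothing_with_trend(data, level, trend, alpha, beta):
    """Holt-style updates (an assumed form, not confirmed by the add-in).

    level, trend: starting estimates for time 0 (e.g. from a regression fit).
    Returns the final (level, trend) after processing the data.
    """
    for x in data:
        previous_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend

def trend_forecast(level, trend, t):
    """Forecast t periods ahead: level plus t steps of the trend."""
    return level + trend * t

# Illustrative data that continues a perfect trend of 2 per period.
level, trend = smoothing_with_trend([12, 14, 16], level=10, trend=2,
                                    alpha=0.2, beta=0.1)
print(level, trend, trend_forecast(level, trend, 2))
```

Unlike simple exponential smoothing, the forecast here changes with the interval t because the estimated trend is projected forward.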
|
More than One Forecast |
|
|
The Add Forecast dialog allows for the analysis
of more than one series. The dialog is shown below for
two series.
The forecast is produced below using simulated
data. Both forecasts use the same method but may have different
parameters. |
|
Seasonal |
|
To illustrate the
use of the Seasonal button, we consider a time series
that records the visits to a web page. Data showing the visits
to the site for a 28-day period are shown below. It appears
that there is a weekly variation in the data with the fewest
visits on Saturday and Sunday. We have tabulated the first
three weeks at the top right with the total and average number
of visits per day. At the bottom right we divide the visits
by the average for each week to compute the relative number
of visits compared to the average. Finally we average these
over the three weeks to compute an adjustment factor for the
days.
To analyze the data we click the Seasonal button
on the Add Forecast dialog and indicate the Season
Cycle as 7. One week is specified as the history and we
provide room for 28 days of data.
The forecast is produced below with the data
placed in column O. Column P is for the adjustment factors.
The first 7 cells in this column hold the factors computed
above. The remaining cells in this column hold equations linking
to the first cells. Thus changing the factors in the first
7 cells will change the entire column. The adjusted data is
computed in column Q by dividing the data by the adjustment
factors. Columns R and S perform the exponential smoothing
with trend to forecast the adjusted data in column T. The forecasted
visits in column V are computed by multiplying column T by
the adjustment factors.
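The seasonal adjustment steps described above can be sketched in Python: compute each day's visits relative to its week's average, average those ratios across weeks to get the factors, divide the data by the factors, and multiply the forecasts by them. The function names and the visit counts are hypothetical, and the forecasting step itself is omitted.

```python
def seasonal_factors(data, cycle):
    """Average, over the complete cycles, of each value divided by its cycle mean."""
    weeks = [data[i:i + cycle] for i in range(0, len(data) - cycle + 1, cycle)]
    relative = [[x / (sum(week) / cycle) for x in week] for week in weeks]
    return [sum(r[d] for r in relative) / len(relative) for d in range(cycle)]

def deseasonalize(data, factors):
    """Adjusted data: each observation divided by the factor for its position."""
    return [x / factors[i % len(factors)] for i, x in enumerate(data)]

def reseasonalize(forecasts, factors):
    """Forecasted values: adjusted forecasts multiplied back by the factors.

    Assumes forecasts[0] falls on the same position in the cycle as data[0].
    """
    return [f * factors[i % len(factors)] for i, f in enumerate(forecasts)]

# Illustrative data: three identical weeks with fewer visits on the last two days.
week = [120, 110, 100, 100, 110, 80, 80]
visits = week * 3
factors = seasonal_factors(visits, cycle=7)
adjusted = deseasonalize(visits, factors)
print(factors)
print(adjusted[:7])
```

With the weekly pattern removed, the adjusted series can be forecast by any of the methods above before the factors are applied again.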
|
|
|