Estimating capital project budgets can be a tricky business. There are many factors to include as inputs to the budget: vendor quotes, capital labor, permits, third- party inspections, expenditure data from previous projects, material escalation… the list goes on and on. Unfortunately, each input can itself be an estimate, and each input can present cost uncertainty risks. Because it is impossible to calculate a precise, infallible budget (owing to the nature of project risk and cost uncertainty), numerical analysis is a good option for arriving at a solution. In the case of predicting capital budgets, the Monte Carlo Simulation (MCS) is the numerical method of choice.

What is a MCS? In terms of project cost risk, the MCS is a tool for calculating the statistical likelihood of exceeding a base budget by a given value. The goal is to assign a statistically-derived dollar value to the various risks associated with the project. The MCS calculates budget values and likelihoods based on two key inputs for each budget line item: 1) the forecasted cost distribution for the expenditure, and 2) the chance of incurring the expenditure.For example, Item A should cost between $70,000 and $85,000, and there is a 100% of incurring the cost; Item B should cost between $10,000 and $15,000, but there is only an 80% chance of incurring the cost. (Table 1).**Building Your Own MCS Tool**The most important aspect of understanding MCS mechanics is knowing how to generate meaningful cost probability distributions. There are many probability distributions that could be used to describe forecasted cost: normal distribution (standard bell curve), uniform distribution (see Table 2), log-normal distribution, beta distribution, etc. The problem lies in translating straightforward cost data into statistical distribution lingo. It takes a skilled statistician to, for example, analyze limited data and generate the α and β parameters of a beta distribution. Moreover – how does “α=2, β=5” translate to a $10,000,000 line item? The beta parameters have no clear meaning on the surface.Fortunately, there exists a straightforward and easy-to-generate distribution that is well suited to MCSs: the triangular distribution (TriD). A TriD has three simple parameters: low limit

*a*, high limit

*b*, and likely value

*c*. The TriD’s Probability Distribution Function (PDF – a function that describes the likelihood of a random variable having any given value) shows the characteristic triangular shape, with the “likely value” having the highest chance of occurring. The chance of occurring drops to 0% at the high and low values. (Figure 1).

*a*,

*b*,

*c*) used to define the distribution are easily superimposed onto a budget line item: Lowest possible cost, highest possible cost, and most likely cost, respectively. This aids in easy comprehension of the cost inputs. Additionally, the triangular shape of the distribution is versatile enough to approximate forecasted cost for a variety of possible line items. For example, “time & material” line items can place the Likely Cost centrally between the Low and High Costs (roughly approximating a bell curve) – this accounts for the risk of either less or more T&M cost. For lump-sum and not-to-exceed budget items, the Likely Cost can be set closer to (or equal to) the High Cost, indicating low likelihood of exceeding the line item cost. If using expired vendor quotations to forecast future projects, the Likely Cost can be set equal to the Low Cost, indicating likelihood of future cost being higher than current benchmark data.Important Note: For each line item, the MCS user must carefully consider the Low, High, and Likely values. Obtaining early consensus on MCS inputs with vested internal parties can prevent questions and challenges in later project stages.Returning to the previous MCS input example (Table 1), Item A was assigned a low limit of $70,000 and a high limit of $85,000. To generate a TriD for Item A, the only missing input is assignment of the likely value. Assuming that a vendor has quoted Item A at $75,000, our TriD specification is complete (

*a*=$70,000,

*b*=$85,000,

*c*=$75,000). The only remaining hurtle is to transform the three TriD parameters into a randomly-generated cost data sample that takes on the triangular distribution when tabulated.Using a spreadsheet’s random number generator, infinite random decimal numbers between 0 and 1can be created. Kotz and van Dorp provide the equations necessary to transform this random variate into a triangular distribution with Low Value =

*a*, High Value =

*b*, and Likely Value =

*c*. The TriD transformed line item cost “CostTriD” for a random cost variable x in any iteration is given by Equations 1 and 2 (for 0 ≤ x ≤ 1):

*a*,

*b*,

*c*, and likelihood), along with two long columns of random numbers (

*x*and

*y*for each iteration), Formula 1 will generate the TriD data points for each line item. To transform this data set into a meaningful visual representation, the individual points must be sorted into a Cost Frequency Distribution – a graph showing how frequently the CostI results fall into various cost brackets. To accomplish this, create a column of 20 numbers: from the line item’s Low Value to High Value, with intervals of (High-Low)/20. This column will be the values of the x axis for the cost frequency distribution. To sort each randomly generated CostI into a cost frequency distribution bracket, a vlookup is performed:

**Important Note:**This MCS example was performed with 1,000 iterations for illustrative purposes. To increase statistical robustness of the MCS result, the MSC spreadsheet creator should consider using a very large number of iterations: targeting a minimum of 20,000. 50,000 iterations or more are desirable (but will result in a large filesize of the spreadsheet – not really a concern in this age of inexpensive storage).For the spreadsheet power-user, there is a method of transforming the standard TriD into a Modified TriD (TriMod) – similar to the TriGen function in the Excel @Risk add-on. The TriMod allows the user to specify a small chance of under- of over-running the budget line item. This is useful in cases where there is low confidence in the selected High and Low cost inputs. The TriMod calculations determine the new High and Low values that would be required to generate the desired percentages of area outside the original High and Low values. See Figure 4: The area of right-triangle θ represents the percentage of the TriMod triangle that falls outside of Low value a, and inside of new Low value m. The area of right- triangle Φ represents the percentage of the TriMod triangle that falls outside of High value b, and inside of new High value n.

*m*is expressed as a function of itself. By using a brute-force iterative guessing method, the value of

*m*can be calculated [ie, a table of 2 columns:

*m*and f(

*m*), where each new

*m*is guessed based on approaching the last calculated f(

*m*)], and entered into Equation 10 to obtain

*n*. By solving for

*m*and

*n*, the original (

*a*,

*b*,

*c*) of the TriD have been transformed into the (

*m*,

*n*,

*c*) of the TriMod. Figure 5 below shows an example a TriMod superimposed on its TriD basis:

*a*=$400,000;

*b*=$500,000;

*c*=$450,000; θ=3%; Φ=7%.

**References:***Kots, Samuel, and Johan Rene van Dorp. Beyond Beta – Other Continuous Families of Distributions with Bounded Support and Applications. Singapore: World Scientific Publishing Co, 2004. Print.***About the Author:***Keith Melchiors is currently a process engineer at Biogen Idec in Research Triangle Park, North Carolina. During the writing of this article, he was senior process engineer with Integrated Process Technologies in Cary, N.C. Keith earned his B.S. ChE from Purdue University in 2002, and has 10 years experience in biopharmaceutical process design and capital project execution.*

**Related article:**

*Quantifying Breakeven Point with Contingencies*