Beyond Gut Feelings: Data-Driven Software Development Forecasts

For nearly a decade I’ve sat through countless Big Room Planning events, watching slide after PowerPoint slide as Project Managers, Scrum Masters, Team Leads, or whoever is presenting state, over and over, “We’ll deliver by the end of Q3, no problem!”

Even companies who aren’t scaling have their own versions of this scenario.

When things inevitably don’t work out, even more time is spent manufacturing excuses and defending the team, along with the promise to “do better next time”.

Traditional estimating methods often rely on expert opinions and optimistic timelines. These techniques fail to incorporate the inherent complexities and uncertainties of developing software.

The result? Missed deadlines, frustrated stakeholders, and a growing lack of management trust in the team.

What if there was a more reliable way to answer the ever-present question “when will it be done?”

What if we could move beyond gut feelings and embrace a data-driven approach that acknowledges uncertainty and provides a clearer picture of potential outcomes? (note the plural “outcomes”)

The answer lies in combining Lean metrics with Monte Carlo simulations. This combination transforms software delivery forecasting from a guessing game into a more predictable science.

The Challenges of Traditional Estimates

In the 40+ years I’ve been in IT, software delivery timelines have continued to be a mix of wishful thinking and high-level estimates that don’t stand up to reality. Then throw in the pressure to commit to arbitrary dates, which are often management promises made without team-level input.

These traditional methods suffer from several critical flaws:

  • Over-reliance on Expert Opinion: While experience is valuable, individual estimates are often influenced by cognitive biases, optimism bias in particular, and a tendency to underestimate complexity. A developer might estimate a task in isolation, neglecting potential roadblocks and dependencies.
  • The Illusion of Certainty: Gantt charts and project plans present a linear, deterministic view of progress. These plans fail to account for the inherent variability in software development. Unforeseen complexity, emergent requirements and ubiquitous dependencies are realities that traditional estimates fail to capture.
  • Ignoring Historical Data: Organizations often fail to leverage their own past performance to inform future forecasts. Each effort is treated as being unique, ignoring valuable patterns that could provide crucial insights.
  • The “Hockey Stick” Effect: Projects often show little progress initially followed by an all-hands-on-deck high pressure push as the deadline approaches. This leads to compromised quality and burnout. This pattern is a clear indicator of unrealistic initial estimates.
  • Watermelon Status Reports: Associated with the above point, status reports indicate a “Green” project until reality catches up and the true status of “Red” can no longer be hidden. Then the hockey stick happens.

These challenges have predictable consequences. Senior management struggles with strategic planning and people allocation when delivery timelines are unreliable. Front-line managers and teams face constant pressure to meet unrealistic deadlines, leading to stressed teams and potentially lower quality. Stakeholder trust erodes when promises are consistently broken.

Yet companies continue to repeat the process – doing what they’ve always done. “We’ll do better next time.”

Introducing Lean Metrics for Forecasting

The lean philosophy, with its focus on flow and continuous improvement, offers a grounded approach to understanding and predicting software delivery. Key lean metrics provide insights into the actual rhythm and efficiency of the development process:

  • Cycle Time: This measures the time it takes for a single work item to move from the moment work begins on it until it is completed. Analyzing cycle time provides insights into the typical duration of tasks and highlights potential bottlenecks in the workflow. For instance, consistently long cycle times for certain types of tasks might indicate process inefficiencies or dependencies. Cycle Time is very useful for forecasting the completion of a single item.
  • Lead Time: This measures the time it takes for a single work item to move from the moment it is added to the backlog until it is completed. Analyzing lead time provides insights into what the customer, or stakeholder, experiences. Lead Time is often referred to as “concept to cash” – a phrase made popular by the Poppendiecks (Mary and Tom). Lead Time is very useful for forecasting the completion of multiple items.
  • Throughput: This metric tracks the number of work items completed within a specific timeframe such as stories per week or features per month. Tracking throughput trends tells you if the team’s delivery is stable, improving, or getting worse. Trends are more important than single values.
  • Work in Progress (WIP): WIP refers to the number of work items the team has started but not finished. High WIP can lead to context switching, which reduces focus and lengthens cycle times. By visualizing and limiting WIP, teams can improve flow and increase throughput. Tools like Kanban boards are well suited to visualizing WIP and assessing flow.
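To make the four definitions concrete, here is a minimal Python sketch that derives them from three dates per work item. The `WorkItem` record and its field names are assumptions for illustration only; your ALM tool’s export will name things differently, and the sample dates are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkItem:
    """Hypothetical record; real ALM exports use different field names."""
    created: date             # added to the backlog
    started: Optional[date]   # work began (None if not started)
    finished: Optional[date]  # completed (None if still open)


def cycle_time_days(item: WorkItem) -> int:
    """Days from the start of work to completion (finished items only)."""
    return (item.finished - item.started).days


def lead_time_days(item: WorkItem) -> int:
    """Days from backlog entry to completion (finished items only)."""
    return (item.finished - item.created).days


def throughput(items: list[WorkItem], start: date, end: date) -> int:
    """Number of items finished within [start, end], inclusive."""
    return sum(1 for i in items if i.finished and start <= i.finished <= end)


def wip(items: list[WorkItem], on: date) -> int:
    """Items started on or before `on` that are not finished by that day."""
    return sum(
        1 for i in items
        if i.started and i.started <= on
        and (i.finished is None or i.finished > on)
    )


# Invented sample data, for illustration only.
items = [
    WorkItem(date(2024, 1, 2), date(2024, 1, 8), date(2024, 1, 12)),
    WorkItem(date(2024, 1, 3), date(2024, 1, 9), date(2024, 1, 19)),
    WorkItem(date(2024, 1, 5), date(2024, 1, 15), None),
    WorkItem(date(2024, 1, 10), None, None),
]

print("cycle time:", cycle_time_days(items[0]), "days")
print("lead time:", lead_time_days(items[0]), "days")
print("throughput, week of Jan 8:", throughput(items, date(2024, 1, 8), date(2024, 1, 14)))
print("WIP on Jan 16:", wip(items, date(2024, 1, 16)))
```

The sketch counts calendar days; whether you count calendar or working days matters less than applying the same rule consistently across your history.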

Visualizing these metrics through tools like Cumulative Flow Diagrams (CFDs) provides useful insights. CFDs illustrate the rate of arrival, progress, and departure of work items, revealing bottlenecks, process inefficiencies and trends in the lean metrics described above.

By understanding patterns in the CFD we move away from subjective estimates to a data-informed measure of our delivery process. These measures enable data-driven forecasting.
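For readers who want to see what sits behind a CFD, the sketch below counts, for each day, how many items are waiting, in progress, and done. It assumes a simple three-state workflow and invented dates; real boards usually have more columns, and charting tools do this for you.

```python
from datetime import date, timedelta

# Hypothetical (created, started, finished) dates per work item; None = not yet.
ITEMS = [
    (date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)),
    (date(2024, 1, 1), date(2024, 1, 3), None),
    (date(2024, 1, 2), None, None),
]


def cfd_rows(items, first_day, last_day):
    """Return one (day, backlog, in_progress, done) row per calendar day."""
    rows = []
    day = first_day
    while day <= last_day:
        arrived = sum(1 for c, _, _ in items if c <= day)
        started = sum(1 for _, s, _ in items if s is not None and s <= day)
        done = sum(1 for _, _, f in items if f is not None and f <= day)
        # Each band is the difference between two cumulative counts.
        rows.append((day, arrived - started, started - done, done))
        day += timedelta(days=1)
    return rows


for day, backlog, in_progress, done in cfd_rows(ITEMS, date(2024, 1, 1), date(2024, 1, 4)):
    print(day, "backlog:", backlog, "in progress:", in_progress, "done:", done)
```

Plotted as stacked areas, the height of the “in progress” band on a given day is the WIP, and a band that keeps widening is a hint that work is arriving faster than it leaves.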

Understanding Monte Carlo Simulations

While lean metrics provide the foundation, Monte Carlo simulations bring uncertainty into our forecasts explicitly. Failing to account for uncertainty is one of the main reasons traditional estimates so often fail.

Named after the famous casino in Monaco, these simulations don’t predict a single outcome the way traditional estimating does. Instead, teams run thousands of possible scenarios based on the historical data, generating a range of potential completion dates and their associated probabilities. Hence, “probabilistic forecasting”.

You already experience this kind of forecasting, just not under that name. If you’ve ever changed plans based on a “70% chance of rain on Saturday”, you were adapting to a probabilistic forecast built from many simulated weather scenarios.

The core principle is simple. Instead of relying on a single value estimate for how long a task will take, we use the historical distribution of our lead times. For example, if our historical data shows that user stories take between 20 and 30 days from concept to cash, with certain probabilities for each duration, the Monte Carlo simulation will randomly sample from this distribution for each work item in our backlog.

By running this sampling process thousands of times, the simulation generates a probability distribution of possible completion dates for the entire project or a specific set of work items. This allows us to answer questions like:

  • “What is the probability of delivering by the end of Q3?”
  • “What is the range of likely completion dates?”
  • “What date can we commit to with 85% confidence (the 85th percentile)?”

This probabilistic approach incorporates the inherent uncertainty in software development. The result is a much more realistic view of potential outcomes than a single, often optimistic, date.
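The sampling idea and the three questions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production forecaster: the lead times in `history` are invented, the function names are my own, and each trial simply adds up one sampled lead time per item, which assumes items are delivered one after another.

```python
import math
import random
from datetime import date, timedelta


def simulate_durations(lead_times, backlog_size, trials=10_000, seed=None):
    """Each trial draws one historical lead time per backlog item and sums them.

    Assumption: items are delivered one after another, so durations add up.
    Returns the trial totals (in days), sorted ascending.
    """
    rng = random.Random(seed)
    return sorted(
        sum(rng.choices(lead_times, k=backlog_size)) for _ in range(trials)
    )


def probability_by(durations, start, deadline):
    """Share of trials that finish on or before the deadline."""
    budget = (deadline - start).days
    return sum(1 for d in durations if d <= budget) / len(durations)


def percentile_date(durations, start, pct):
    """Date by which `pct` percent of the sorted trials have finished."""
    index = max(0, math.ceil(len(durations) * pct / 100) - 1)
    return start + timedelta(days=durations[index])


history = [20, 22, 23, 25, 25, 26, 27, 28, 30, 30]  # invented lead times (days)
start = date(2024, 5, 27)
durations = simulate_durations(history, backlog_size=5, seed=42)

print("P(done by end of Q3):", probability_by(durations, start, date(2024, 9, 30)))
print("Likely range:", percentile_date(durations, start, 5),
      "to", percentile_date(durations, start, 95))
print("85% confidence date:", percentile_date(durations, start, 85))
```

Sampling directly from the recorded values, rather than fitting a curve to them, keeps the forecast tied to what the team actually did, outliers included.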

Combining Lean Metrics with Monte Carlo

The value lies in combining lean metrics with the predictive capabilities of Monte Carlo simulations. Here’s how you can apply these techniques in your environment:

  1. Collect Lean Metrics: Start by collecting historical data on Cycle Time, Lead Time and Throughput for your team or team of teams if they are working on a common effort. Your ALM (Agile Lifecycle Management) tool likely already does this for you. The more data you have, the more accurate your simulations will be. Ensure the data is for the current team and process.
  2. Define the Scope: Identify the scope of the work you want to forecast such as a release or a set of features. Break down the work into smaller, manageable items.
  3. Calculate Remaining Work (in Units of Throughput): Instead of estimating in story points or ideal days, count the remaining work items (such as user stories or features).
  4. Run the Simulation: Use the historical Lead Time or Throughput data as input for the Monte Carlo simulation. Many tools, such as popular spreadsheet programs with add-ins, and many ALM tools, include Monte Carlo simulation features.
    • Using Lead Time: If using Lead Time, for each remaining work item, the simulation will randomly draw a lead time from your historical distribution. By summing these simulated lead times, it calculates a potential completion date. This happens thousands of times to generate the probability distribution.
    • Using Throughput: If using Throughput, the simulation will randomly sample historical throughput values for a given period. By dividing the remaining work by these simulated throughput values, it calculates a potential completion duration. This also happens many times.
  5. Interpret the Results: The output of the simulation will be a probability distribution showing the likelihood of completing the work by different dates. Focus on understanding the probabilities associated with key delivery dates and the range of potential outcomes. For example, the simulation illustrated below indicates an 87% probability of delivering on or before October 3rd, 2018.
    Nave Software – Monte Carlo Simulation
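If you want to see the throughput-based steps in miniature before reaching for a tool, here is a hedged Python sketch. It uses a common variant of the approach: rather than performing a single division, each trial replays randomly chosen past weeks one at a time until the remaining items run out. The weekly counts in `history` and the function names are invented for illustration.

```python
import random


def simulate_weeks(weekly_throughput, remaining_items, trials=10_000, seed=None):
    """Return the number of weeks each simulated trial needs to finish."""
    if not any(t > 0 for t in weekly_throughput):
        raise ValueError("history needs at least one week with completed items")
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        remaining, weeks = remaining_items, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)  # replay a random past week
            weeks += 1
        results.append(weeks)
    return results


def chance_within(results, weeks):
    """Fraction of trials that finished in `weeks` or fewer."""
    return sum(1 for r in results if r <= weeks) / len(results)


history = [3, 5, 2, 6, 4, 4, 7, 3]  # invented: items finished per week
for scope in (30, 40):
    results = simulate_weeks(history, remaining_items=scope, seed=1)
    print(scope, "items:", f"{chance_within(results, 10):.0%} chance within 10 weeks")
```

Running it for two scope sizes, as the loop does, is also a quick way to show a Product Owner how adding or removing work items moves the odds of hitting a date.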

Benefits for Senior Management

For senior leaders, adopting this data-driven forecasting approach offers significant advantages:

  • Improved Strategic Planning: Instead of relying on potentially overly optimistic delivery estimates, senior management can make more informed strategic decisions based on realistic probability ranges for delivery. This allows for better alignment of product roadmaps with business objectives.
  • Better Resource Allocation: Understanding the likelihood of different delivery scenarios enables more effective resource planning and budgeting. Leaders can anticipate potential delays and proactively adjust team capacity or project scope.
  • Enhanced Stakeholder Communication: Communicating delivery expectations based on probabilities fosters greater transparency and trust with stakeholders. Instead of a single, fixed date that will likely be missed, presenting a range of likely outcomes manages expectations more effectively.
  • Reduced Risk of Missed Deadlines: By acknowledging and quantifying uncertainty, organizations can proactively mitigate risks and make informed decisions to improve the probability of on-time delivery.

Benefits for Teams and Development Managers

Both teams and front-line development managers gain:

  • More Effective Planning: Understanding historical throughput and cycle times can inform more realistic short-term (like Sprint) capacity and improve the predictability of delivery.
  • Better Identification of Bottlenecks: Analyzing cycle time data, using tools like a CFD, can highlight areas in the workflow where work is consistently delayed. This makes the constraint easier to identify and helps teams and managers focus improvement efforts there.
  • Data-Driven Decision-Making: Instead of relying on gut feelings and upfront estimation during scope discussions or prioritization, Product Owners and managers can use simulation results to understand the impact of adding or removing work items on the delivery forecast.
  • Improved Team Morale: By using realistic forecasts expressed as probability ranges rather than single, fixed-date deadlines, teams reduce the likelihood of Product/Project “Death Marches” as that date approaches.

Common Objections

Adopting new approaches often faces resistance. Here are some common objections and how to address them:

  • “We don’t have enough data.” Start collecting data now. You may already have this information within your ALM tool. Even a few months of consistent data collection can provide valuable insights. Begin with key metrics like cycle time and throughput and then expand. For new teams or projects, you can use data from similar past projects as a starting point and refine as you gather additional data.
  • “Our projects are too unique.” While every effort has its unique characteristics, there are likely patterns in your team’s workflow and delivery capabilities. Focus on the process rather than the specific product features. Look for trends in cycle times for similar types of work.
  • “It’s too complicated.” While the term “Monte Carlo Simulation” might sound like technical wizardry, you already know it from examples such as weather forecasts. Many tools simplify the process. Start with understanding the basic concepts and then explore the capabilities of different tools. Focus on the value improved predictability brings and what that unlocks in your organization.

Embracing Predictability

In my 40+ years of software development experience there is one thing I can say for sure – absolute certainty is an illusion.

By moving beyond gut feeling estimates and embracing a data-driven approach that leverages the power of lean metrics and Monte Carlo simulations, we improve our ability to forecast software delivery.

This shift towards probabilistic thinking gives senior management information that leads to better strategic decisions.

No forecast is perfect. However, increased predictability and transparency lower anxiety over product delivery timeframes. Less stress at all levels improves morale and grows trust across the organization and with your customers.

Put away the estimation tools which have proven over decades to be inadequate for software development. Leverage the data you already have to generate better strategic delivery insights.

Until next time!
