Resilience: planning for what goes wrong

A business that requires everything to go perfectly is fragile. Planning for failure — building backups, redundancy, and recovery plans — means your company survives disruptions that destroy competitors.

Lesson 5 of 5 · Operations · 7 min read

Why this matters

Two restaurants face the same problem: their main supplier cannot deliver this week.

Restaurant A has no backup. They scramble to find alternatives at the last minute, pay premium prices, and still run out of ingredients. Customers leave disappointed.

Restaurant B has relationships with three suppliers. They shift the order to their secondary source within an hour. Customers notice nothing. Business continues.

Same crisis. Completely different outcomes.

The pattern: Resilience is the ability to absorb shocks and continue operating. It comes from anticipating problems before they happen and building systems that can handle them.

The mindset: Instead of hoping nothing goes wrong, assume things will go wrong. Then design accordingly.

1. Single points of failure

A single point of failure (SPOF) is any element whose failure would stop the entire operation.

Common SPOFs in business

The only person who knows a critical process
The only supplier for a key input
The only copy of important data
The only access credentials for a system
The only client that generates most revenue

Finding SPOFs

For every critical process, ask: "If this one thing fails, what happens?"

If the answer is "everything stops," you have identified a single point of failure.
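The question above can be run as a simple inventory check. The sketch below is only an illustration: the process names, the people and suppliers listed, and the `find_spofs` helper are all hypothetical, and it assumes you have already written down who or what can perform each critical process.

```python
# Hypothetical inventory: each critical process mapped to everyone or
# everything that can currently perform it. All names are made up.
critical_processes = {
    "payroll": ["Dana"],
    "flour supply": ["Mill A", "Mill B", "Mill C"],
    "customer database": ["office server"],
    "point-of-sale login": ["owner", "shift manager"],
}


def find_spofs(processes):
    """Return the processes that depend on at most one provider, sorted by name."""
    return sorted(
        name for name, providers in processes.items() if len(providers) <= 1
    )


spofs = find_spofs(critical_processes)
for name in spofs:
    print(f"SPOF: {name} depends only on {critical_processes[name]}")
```

Anything the check flags is a candidate for a backup; whether that backup is worth its cost is a separate judgment.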

Eliminating SPOFs

Every SPOF needs a backup:

Document knowledge so it is not trapped in one head
Maintain relationships with multiple suppliers
Back up data to multiple locations
Share credentials securely so that more than one trusted person has access
Diversify revenue sources

2. Redundancy

Redundancy means having backups for critical elements. It costs more upfront but can prevent catastrophic losses.

Where redundancy matters

Focus redundancy on what would hurt most if it failed:

Revenue-generating systems
Customer-facing operations
Data and intellectual property
Key relationships

Not everything needs a backup. Low-impact, easily replaceable elements do not justify the cost.

Levels of redundancy

Hot standby: Backup is running and ready to take over instantly
Warm standby: Backup exists but needs some setup time
Cold standby: Components available but require significant time to deploy

Match the level of redundancy to how critical the element is and how much downtime is acceptable.
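One way to make that matching explicit is a small decision rule. This is a minimal sketch, not a standard: the `choose_standby` function and its thresholds (one hour for hot, 24 hours for warm) are assumptions chosen for illustration, and every business would set its own.

```python
def choose_standby(acceptable_downtime_hours, hot_max_hours=1, warm_max_hours=24):
    """Pick a standby level from the downtime a process can tolerate.

    The default thresholds are illustrative assumptions, not industry rules.
    """
    if acceptable_downtime_hours < 0:
        raise ValueError("acceptable downtime cannot be negative")
    if acceptable_downtime_hours <= hot_max_hours:
        return "hot"   # backup already running, takes over instantly
    if acceptable_downtime_hours <= warm_max_hours:
        return "warm"  # backup exists but needs some setup time
    return "cold"      # components available, significant time to deploy


# Hypothetical examples: payment processing, weekly reporting, archive storage.
for process, hours in [("payments", 0.5), ("reporting", 8), ("archives", 72)]:
    print(f"{process}: {choose_standby(hours)} standby")
```

The point of writing the rule down is that the cost of a hot standby is only paid where downtime is genuinely unaffordable.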

3. Business continuity planning

A business continuity plan (BCP) defines how operations continue during and after a disruption.

Elements of a BCP

Risk identification

What could go wrong?

Technology failures
Supply chain disruptions
Key person unavailability
Natural disasters
Cyberattacks
Economic shocks

Impact assessment

For each risk, evaluate:

How likely is it?
How severe would the impact be?
How long could operations survive?

Focus planning on the risks that combine high likelihood with high impact.
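A common way to rank risks is to score each one as likelihood times impact. The sketch below assumes a 1-to-5 scale for both, and the risks and their ratings are invented for illustration; the `Risk` class is hypothetical, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (business-ending)

    @property
    def score(self) -> int:
        """Simple priority score: likelihood multiplied by impact."""
        return self.likelihood * self.impact


# Made-up example ratings for a small business.
risks = [
    Risk("Main supplier misses a delivery", likelihood=4, impact=3),
    Risk("Office laptop destroyed", likelihood=2, impact=4),
    Risk("Key employee unavailable", likelihood=3, impact=5),
    Risk("Regional flood", likelihood=1, impact=5),
]

# Highest score first: these are the risks to plan for before the others.
ranked = sorted(risks, key=lambda r: r.score, reverse=True)
for risk in ranked:
    print(f"{risk.score:>2}  {risk.name}")
```

A single multiplied score is a blunt tool. It does not capture the third question, how long operations could survive, so a rare risk the business could not survive at all may deserve attention despite a low score.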

Response procedures

For priority risks, document:

Who is responsible for the response?
What steps are taken immediately?
How is communication handled?
What resources are needed?

Recovery steps

After the immediate response:

How do you return to normal operations?
What needs to be repaired or replaced?
How do you learn from the incident?

4. The premortem

A premortem is a planning exercise that imagines failure before it happens.

How it works

Instead of asking "How will this succeed?" ask "It is six months from now and this failed completely. What went wrong?"

This reframes planning from optimism to realism. People are more willing to name risks when the task is framed as explaining a past failure rather than predicting future problems.

Running a premortem

Describe the scenario: "The project has failed. We lost time and money."
Ask each person: "What happened? Why did it fail?"
Collect all the reasons without judgment
Prioritize the most likely and most damaging
Create preventive actions for priority risks

When to use

Run premortems before major initiatives, new projects, or significant changes. They surface risks that optimism blinds you to.

5. Graceful degradation

When failure occurs, systems should degrade gracefully rather than collapse completely.

What graceful degradation looks like

A website that shows a "we're experiencing high traffic" message instead of crashing completely
A service that continues with reduced features rather than suffering a total outage
A supply chain that shifts to secondary sources with slightly higher costs rather than stopping

The goal is not to prevent all impact but to limit the severity.

Designing for degradation

For critical systems, define:

What is the minimum viable function?
What can be temporarily disabled?
What manual workarounds exist?

Build these fallbacks into the design rather than improvising during a crisis.
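In software terms, building fallbacks into the design often looks like an ordered chain: try the full service, then the minimum viable function, then the manual workaround. The sketch below is one possible shape under that assumption; `with_fallbacks` and the three order-taking functions are hypothetical names invented for this example.

```python
def with_fallbacks(*handlers):
    """Build a function that tries each handler in order.

    It returns the first result that succeeds and raises only if every
    handler fails.
    """
    def run(request):
        last_error = None
        for handler in handlers:
            try:
                return handler(request)
            except Exception as error:  # a failed level falls through to the next
                last_error = error
        raise RuntimeError("every fallback failed") from last_error
    return run


def full_service(order):
    # Simulated outage of the normal ordering system.
    raise ConnectionError("online ordering system is down")


def minimum_service(order):
    # Minimum viable function: take the order, skip non-essential features.
    return f"Order '{order}' taken by phone; loyalty points will be added later"


def manual_workaround(order):
    # Last resort: a paper process that keeps the business trading.
    return f"Order '{order}' written on paper for later entry"


take_order = with_fallbacks(full_service, minimum_service, manual_workaround)
result = take_order("2 lattes")
print(result)
```

Because the order of fallbacks is decided in advance, nobody has to invent a workaround in the middle of the outage.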

6. Recovery and learning

After a disruption, the goal is not just to return to normal but to become stronger.

Post-incident review

After any significant disruption, analyze:

What happened exactly?
What worked in the response?
What did not work?
What would we do differently?

Document the findings. This creates institutional memory that improves future responses.

Turning failure into improvement

The best organizations use failures as forcing functions for improvement. Each incident reveals weaknesses that might never have been discovered otherwise.

"Never waste a crisis" means using disruption as an opportunity to fix underlying problems.

Building a resilient culture

Resilience is not just systems; it is also a mindset. Teams that expect challenges and prepare for them respond better than teams that assume smooth sailing.

Normalize discussion of risks. Reward preparation over optimism. Celebrate recovery as much as success.

Think

What would you do in these scenarios?

The Coffee Shop Expansion

You are the manager of a successful local coffee shop. A large international chain is opening a store just across the street. How do you respond to maintain your market position?

Practice

Test yourself and review key terms

Knowledge check

What is the primary indicator of a successful Market Expansion Strategy?

Concepts

Question

What should you do after a major disruption, beyond fixing the immediate damage?

Answer

Analyze what worked and what did not, then document the findings — this institutional memory improves future responses.

Apply

Your action steps for today

1. Identify one single point of failure in your business. What would happen if that person, supplier, system, or client disappeared tomorrow? Start building a backup this week.
2. Run a mini-premortem. Think of an upcoming project or initiative. Imagine it failed spectacularly. Write down three reasons why it might have failed. Address the biggest one.
3. Check your data backups. Do you know where your critical files are backed up? Could you recover if your laptop were destroyed today? If not, set up a backup system now.

Note

Some examples and details may be simplified to better convey the core idea. Every business is different — adapt these ideas to your specific context and situation.

CORE MBA