Error Handling and Recovery in ZIO

Why Error Handling Matters

In traditional Scala code, errors are often invisible until runtime:

def fetchUser(id: Int): User = {
  // Can throw DatabaseException
  // Can throw NetworkException
  // Can return null
  // Can throw NullPointerException
  // The signature doesn't tell you any of this
  database.query(s"SELECT * FROM users WHERE id = $id")
}

Questions arise:

  • Which errors can this function throw?
  • Should I catch them?
  • What should I do if it fails?
  • How do I test error scenarios?

ZIO makes errors explicit, typed, and composable. Every error is tracked in the type system, making your code safer and more maintainable.

Error Types in ZIO

Remember ZIO[R, E, A]? The E parameter is your error channel.

Nothing - Cannot Fail

val infallible: ZIO[Any, Nothing, Int] = ZIO.succeed(42)

Nothing means this effect literally cannot fail. The compiler guarantees it.

Throwable - Any Exception

val risky: Task[String] = ZIO.attempt {
  scala.io.Source.fromFile("config.txt").mkString
}
// Task[A] = ZIO[Any, Throwable, A]

This can fail with any exception. It's honest but not specific.

Custom Error Types

sealed trait UserError
case class NotFound(id: Int) extends UserError
case class InvalidId(id: Int) extends UserError
case class InvalidEmail(email: String) extends UserError
case class Unauthorized(userId: Int) extends UserError

def getUser(id: Int): IO[UserError, User] = {
  if (id < 0)
    ZIO.fail(InvalidId(id))
  else if (!authorized(id))
    ZIO.fail(Unauthorized(id))
  else
    // ... fetch user or fail with NotFound
    ???
}

Now errors are documented, typed, and exhaustively checkable.

Typed Errors vs Exceptions

Traditional exceptions:

def divide(a: Int, b: Int): Int = {
  if (b == 0) throw new ArithmeticException("Division by zero")
  else a / b
}

// Caller has no idea this can throw
val result = divide(10, 0)  // Boom! Runtime crash

ZIO with typed errors:

def divide(a: Int, b: Int): IO[String, Int] = {
  if (b == 0) 
    ZIO.fail("Division by zero")
  else 
    ZIO.succeed(a / b)
}

// The error is part of the return type, so the caller sees it
val result: IO[String, Int] = divide(10, 0)
// To get a plain Int out, the String error has to be handled first

Basic Error Handling

catchAll - Recover from All Errors

val risky: IO[String, Int] = ZIO.fail("Something went wrong")

val recovered: UIO[Int] = risky.catchAll { error =>
  Console.printLine(s"Error occurred: $error").orDie *>
  ZIO.succeed(-1)
}
// Converts IO[String, Int] to UIO[Int] - can't fail anymore
// (.orDie is needed because printLine itself can fail with IOException)

catchSome - Recover from Specific Errors

sealed trait DbError
case object ConnectionLost extends DbError
case object QueryTimeout extends DbError
case object InvalidQuery extends DbError

val query: IO[DbError, User] = ???

val handled: IO[DbError, User] = query.catchSome {
  case ConnectionLost =>
    Console.printLine("Reconnecting...").orDie *>
    retryQuery // assumed to be another IO[DbError, User]
  case QueryTimeout =>
    Console.printLine("Query too slow").orDie *>
    ZIO.succeed(User.default)
  // InvalidQuery not handled - still in error channel
}

What happens if InvalidQuery occurs? It remains in the error channel for the caller to handle.
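
To make the pass-through behaviour concrete, here is a minimal sketch assuming ZIO 2.x on the classpath. The names CatchSomeDemo, failingQuery, and handled are hypothetical and exist only for this illustration.

```scala
import zio._

object CatchSomeDemo extends ZIOAppDefault {
  sealed trait DbError
  case object ConnectionLost extends DbError
  case object InvalidQuery   extends DbError

  // Hypothetical query that always fails with the given error.
  def failingQuery(e: DbError): IO[DbError, String] = ZIO.fail(e)

  // Only ConnectionLost is recovered; every other error passes through untouched.
  def handled(e: DbError): IO[DbError, String] =
    failingQuery(e).catchSome { case ConnectionLost => ZIO.succeed("fallback") }

  val run =
    for {
      a <- handled(ConnectionLost).either // Right("fallback")
      b <- handled(InvalidQuery).either   // Left(InvalidQuery)
      _ <- Console.printLine(s"$a / $b")
    } yield ()
}
```

The result type stays IO[DbError, String] because catchSome cannot prove that every case was handled.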

orElse - Try Alternative

val primary: Task[String] = fetchFromPrimaryDb
val backup: Task[String] = fetchFromBackupDb

val resilient = primary orElse backup
// If primary fails, try backup

orElseSucceed - Provide Default

val user: IO[String, User] = fetchUser(123)

val withDefault = user.orElseSucceed(User.guest)
// If fails, return guest user instead

The fold Method - Handle Both Cases

fold lets you handle success and failure at once:

def fetchConfig: IO[String, Config] = ???

val result: UIO[String] = fetchConfig.fold(
  error => s"Failed to load config: $error",
  config => s"Config loaded: ${config.version}"
)

foldZIO - Effectful Handling

val result: Task[String] = fetchConfig.foldZIO(
  error => Console.printLine(s"Error: $error").as("default-config"),
  config => Console.printLine(s"Success: $config").as(config.toString)
)

The difference? fold takes pure functions, foldZIO takes effects.

Error Transformation

mapError - Transform Error Type

sealed trait AppError
case class DatabaseError(msg: String) extends AppError
case class NetworkError(msg: String) extends AppError

val dbQuery: IO[String, User] = ???

val typed: IO[AppError, User] = dbQuery.mapError { msg =>
  DatabaseError(msg)
}

refineOrDie - Narrow Error Types

import java.io.FileNotFoundException

def readFile(path: String): Task[String] = 
  ZIO.attempt(scala.io.Source.fromFile(path).mkString)

// Only FileNotFoundException is expected; any other exception becomes a defect
val refined: IO[FileNotFoundException, String] = 
  readFile("data.txt").refineOrDie {
    case e: FileNotFoundException => e
  }

Other exceptions become "defects": unexpected errors that leave the typed error channel and terminate the fiber unless something handles them explicitly.

absorb - Move Defects into the Error Channel

val effect: Task[Int] = ???  // ZIO[Any, Throwable, Int]

val absorbed: Task[Int] = effect.absorb
// Defects are converted back into ordinary Throwable failures,
// so catchAll, fold, and retry can see them
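
As a rough illustration of what absorb changes, the sketch below (assuming ZIO 2.x) builds a Task that dies with a defect and then recovers from it after absorb. AbsorbDemo, buggy, and recovered are hypothetical names.

```scala
import zio._

object AbsorbDemo extends ZIOAppDefault {
  // A Task that dies with a defect instead of failing in the E channel.
  val buggy: Task[Int] = ZIO.die(new IllegalStateException("boom"))

  // After absorb the defect is an ordinary Throwable failure, so fold can handle it.
  val recovered: UIO[String] =
    buggy.absorb.fold(e => s"recovered from: ${e.getMessage}", n => s"got $n")

  val run = recovered.flatMap(msg => Console.printLine(msg))
}
```

Without the call to absorb, the fold would never run its failure branch, because the defect bypasses the error channel.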

Error Channels Explained

ZIO effects have two failure modes:

1. Failure (E channel) - Expected, recoverable errors

val failure: IO[String, Int] = ZIO.fail("Expected error")

2. Defect - Unexpected bugs (like null pointer, stack overflow)

val defect: UIO[Int] = ZIO.die(new Exception("This shouldn't happen"))

Why the distinction?

  • Failures: Business logic errors you should handle
  • Defects: Programming bugs you should fix
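
One practical consequence of the two channels is that catchAll only sees failures. The sketch below, assuming ZIO 2.x, contrasts it with catchAllDefect; DefectDemo and its value names are made up for the example.

```scala
import zio._

object DefectDemo extends ZIOAppDefault {
  val failure: IO[String, Int] = ZIO.fail("expected")
  val defect: IO[String, Int]  = ZIO.die(new IllegalStateException("bug"))

  // catchAll handles the E channel only.
  val recoveredFailure: UIO[Int] = failure.catchAll(_ => ZIO.succeed(0))

  // The defect is not a String failure, so this handler never runs.
  val stillDies: UIO[Int] = defect.catchAll(_ => ZIO.succeed(0))

  // catchAllDefect is the explicit opt-in for recovering from defects.
  val recoveredDefect: UIO[Int] = stillDies.catchAllDefect(_ => ZIO.succeed(-1))

  val run =
    for {
      a <- recoveredFailure
      b <- stillDies.exit // inspect the outcome instead of crashing the app
      c <- recoveredDefect
      _ <- Console.printLine(s"failure -> $a, defect survived catchAll: ${b.isFailure}, defect -> $c")
    } yield ()
}
```

Recovering from defects is usually reserved for application boundaries such as a request handler, since a defect normally signals a bug rather than a condition to handle.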

Converting Error Channels

either - Convert to Either

val effect: IO[String, Int] = ???

val asEither: UIO[Either[String, Int]] = effect.either
// Never fails - errors become Left, success becomes Right

option - Convert to Option

val effect: IO[String, Int] = ???

val asOption: UIO[Option[Int]] = effect.option
// Never fails - errors become None, success becomes Some

Useful when you don't care about the error details:

val maybeConfig: UIO[Option[Config]] = loadConfig.option

val program = for {
  configOpt <- maybeConfig
  config = configOpt.getOrElse(Config.default)
  _ <- Console.printLine(s"Using config: $config")
} yield ()

Retrying Operations

Simple Retry - retryN

val flaky: Task[String] = fetchFromUnreliableApi

val resilient = flaky.retryN(3)
// Retry up to 3 times if it fails

Retry with Schedule

import zio._ // brings Schedule and the 100.millis duration syntax into scope

// Retry with exponential backoff
val policy = Schedule.exponential(100.millis) && Schedule.recurs(5)

val resilient = flaky.retry(policy)
// Retries 5 times with increasing delays: 100ms, 200ms, 400ms, 800ms, 1600ms
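
A retry policy is easier to reason about with a deterministic failure. The following sketch, assuming ZIO 2.x, uses a Ref to count attempts; RetryDemo and flaky are hypothetical stand-ins for a real unreliable call, and the 10 ms base delay is chosen only to keep the demo fast.

```scala
import zio._

object RetryDemo extends ZIOAppDefault {
  // Hypothetical flaky call: fails on the first two attempts, then succeeds.
  def flaky(attempts: Ref[Int]): IO[String, String] =
    attempts.updateAndGet(_ + 1).flatMap { n =>
      if (n < 3) ZIO.fail(s"attempt $n failed")
      else ZIO.succeed(s"ok after $n attempts")
    }

  // Exponential backoff starting at 10 ms, capped at 5 retries.
  val policy = Schedule.exponential(10.millis) && Schedule.recurs(5)

  val program: IO[String, String] =
    Ref.make(0).flatMap(attempts => flaky(attempts).retry(policy))

  val run = program.either.flatMap(r => Console.printLine(r.toString))
}
```

Here two retries are enough, so the remaining three allowed by Schedule.recurs(5) are never used.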

Common Retry Schedules

// Fixed interval (Schedule.spaced gives a fixed delay between attempts instead)
val fixed = Schedule.fixed(1.second)

// Exponential backoff
val exponential = Schedule.exponential(100.millis)

// Fibonacci backoff
val fibonacci = Schedule.fibonacci(100.millis)

// Limited attempts
val limited = Schedule.recurs(10)

// Combine: exponential with max 5 attempts
val combined = Schedule.exponential(100.millis) && Schedule.recurs(5)

Conditional Retry

sealed trait ApiError
case object RateLimited extends ApiError
case object ServerError extends ApiError
case object InvalidRequest extends ApiError

val api: IO[ApiError, Response] = ???

val retryPolicy = Schedule.recurWhile[ApiError] {
  case RateLimited => true   // Retry rate limits
  case ServerError => true   // Retry server errors  
  case InvalidRequest => false // Don't retry bad requests
}

val resilient = api.retry(retryPolicy)

Practical Example: Resilient API Client

import zio._

sealed trait ApiError
case class NetworkError(msg: String) extends ApiError
case class InvalidResponse(status: Int) extends ApiError
case object RateLimited extends ApiError

case class User(id: Int, name: String)

object ApiClient {

  def fetchUser(id: Int): IO[ApiError, User] = {
    // Simulate flaky API
    ZIO.attempt {
      if (scala.util.Random.nextDouble() < 0.3)
        throw new Exception("Network hiccup")
      User(id, s"User-$id")
    }.mapError(e => NetworkError(e.getMessage))
  }

  def fetchUserResilient(id: Int): IO[ApiError, User] = {
    val retrySchedule = 
      Schedule.exponential(100.millis) && 
      Schedule.recurs(3)

    fetchUser(id)
      .retry(retrySchedule)
      .catchSome {
        case NetworkError(msg) =>
          Console.printLine(s"All retries failed: $msg").orDie *>
          ZIO.fail(NetworkError("Exhausted retries"))
      }
  }

  def fetchUserWithFallback(id: Int): UIO[User] = {
    fetchUserResilient(id)
      .catchAll { error =>
        Console.printLine(s"Falling back to cache: $error").orDie *>
        ZIO.succeed(User(id, "Cached-User"))
      }
  }
}

Combining Multiple Fallible Operations

Sequential Validation (Fails Fast)

def validateEmail(email: String): IO[String, String] = 
  if (email.contains("@")) ZIO.succeed(email)
  else ZIO.fail("Invalid email")

def validateAge(age: Int): IO[String, Int] = 
  if (age >= 18) ZIO.succeed(age)
  else ZIO.fail("Must be 18+")

def validateUser(email: String, age: Int): IO[String, (String, Int)] = 
  for {
    validEmail <- validateEmail(email)
    validAge <- validateAge(age)
  } yield (validEmail, validAge)

// If email is invalid, age is never validated

Parallel Validation

val validation: IO[String, (String, Int)] =
  validateEmail("test@example.com") <&> validateAge(20)
// <&> (zipPar) runs both validations in parallel and keeps their result types
// Fails fast if any validation fails

Validation with All Errors

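import zio.prelude.Validation

// Validation is a pure data type, so the checks are plain functions rather than ZIO effects
def checkEmail(email: String): Validation[String, String] =
  if (email.contains("@")) Validation.succeed(email)
  else Validation.fail("Invalid email")

def checkAge(age: Int): Validation[String, Int] =
  if (age >= 18) Validation.succeed(age)
  else Validation.fail("Must be 18+")

// Accumulates all errors instead of failing fast
def validateUserAll(email: String, age: Int): Validation[String, (String, Int)] = 
  Validation.validateWith(
    checkEmail(email),
    checkAge(age)
  )((e, a) => (e, a))

// Returns either all errors or success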
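
If adding zio-prelude is not an option, ZIO core offers ZIO.validate, which applies one effectful check to every element of a collection and collects all failures. The sketch below assumes ZIO 2.x; ValidateAllDemo and the error messages are illustrative only.

```scala
import zio._

object ValidateAllDemo extends ZIOAppDefault {
  def validateAge(age: Int): IO[String, Int] =
    if (age >= 18) ZIO.succeed(age) else ZIO.fail(s"$age is under 18")

  // Runs every check; on failure the error is a non-empty list of all messages.
  val result: IO[::[String], List[Int]] =
    ZIO.validate(List(20, 15, 12))(validateAge)

  val run = result.either.flatMap(r => Console.printLine(r.toString))
}
```

Unlike the for-comprehension version, the check for 12 still runs after 15 has already failed.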

Advanced Error Patterns

Timeout with Error

val slowOperation: Task[String] = ???

val withTimeout = slowOperation.timeout(5.seconds)
// Returns Option[String] - None if timeout

val withTimeoutError = slowOperation.timeoutFail(
  new java.util.concurrent.TimeoutException("Operation timed out")
)(5.seconds)
// Fails with a TimeoutException on timeout, so the error type stays Throwable

Error Recovery with Logging

def logError[E](error: E): UIO[Unit] = 
  Console.printLine(s"Error occurred: $error").orDie

val operation: IO[String, Int] = ???

val withLogging = operation.tapError(logError)
// Logs error but keeps it in error channel

Ensuring Cleanup

val resource: Task[Resource] = acquireResource

// acquire, then release, then use (ZIO 2 API; ZIO 1 called this bracket)
val program = ZIO.acquireReleaseWith(resource)(
  r => r.close().orDie   // release
)(
  r => r.process()       // use
)
// Resource is released even if process() fails
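
The release guarantee can be observed by recording lifecycle events in a Ref. This is a minimal sketch assuming ZIO 2.x; CleanupDemo and the event strings are hypothetical.

```scala
import zio._

object CleanupDemo extends ZIOAppDefault {
  // Records lifecycle events so their order can be inspected afterwards.
  def program(log: Ref[List[String]]): IO[String, Unit] =
    ZIO.acquireReleaseWith(log.update(_ :+ "acquire"))(_ => log.update(_ :+ "release")) { _ =>
      log.update(_ :+ "use") *> ZIO.fail("process failed")
    }

  val events: UIO[List[String]] =
    for {
      log      <- Ref.make(List.empty[String])
      _        <- program(log).either // the use step fails, but release still runs
      recorded <- log.get
    } yield recorded

  val run = events.flatMap(e => Console.printLine(e.mkString(" -> ")))
}
```

The recorded order is acquire, use, release, even though the use step fails.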

Testing Error Scenarios

import zio._
import zio.test._
import zio.test.Assertion._

object ErrorHandlingSpec extends ZIOSpecDefault {
  def spec = suite("Error Handling")(
    test("handles division by zero") {
      val result = divide(10, 0)
      assertZIO(result.either)(isLeft(equalTo("Division by zero")))
    },

    test("retries failing operation") {
      var attempts = 0
      val flaky = ZIO.attempt {
        attempts += 1
        if (attempts < 3) throw new Exception("Not yet")
        else "Success"
      }

      val result = flaky.retryN(5)
      assertZIO(result)(equalTo("Success"))
    },

    test("falls back to default") {
      val failing = ZIO.fail("Error")
      val withDefault = failing.orElseSucceed("Default")
      assertZIO(withDefault)(equalTo("Default"))
    }
  )
}

Real-World Example: Database with Retries

import zio._

sealed trait DbError
case class ConnectionFailed(reason: String) extends DbError
case class QueryFailed(sql: String) extends DbError
case object Timeout extends DbError

trait Database {
  def query(sql: String): IO[DbError, List[String]]
}

object DatabaseLive {
  val layer: ZLayer[Any, Nothing, Database] = ZLayer.succeed(
    new Database {
      def query(sql: String): IO[DbError, List[String]] = {
        val retrySchedule = 
          Schedule.exponential(100.millis) && 
          Schedule.recurs(3) &&
          Schedule.recurWhile[DbError] {
            case ConnectionFailed(_) => true
            case Timeout => true
            case QueryFailed(_) => false
          }

        performQuery(sql)
          .retry(retrySchedule)
          .timeoutFail(Timeout)(10.seconds)
          .tapError { error =>
            Console.printLine(s"Query failed: $error").orDie
          }
      }

      private def performQuery(sql: String): IO[DbError, List[String]] = 
        ZIO.attempt {
          // Actual database query
          List("result1", "result2")
        }.mapError(e => QueryFailed(sql))
    }
  )
}

// Usage
val program = for {
  db <- ZIO.service[Database]
  results <- db.query("SELECT * FROM users")
    .catchSome {
      case Timeout => 
        Console.printLine("Query timed out, using cached results") *>
        ZIO.succeed(List("cached"))
    }
  _ <- Console.printLine(s"Results: $results")
} yield ()

Best Practices

1. Use Custom Error Types

// Good: Specific, typed errors
sealed trait UserError
case class NotFound(id: Int) extends UserError
case class InvalidEmail(email: String) extends UserError

// Avoid: Generic string errors
def getUser(id: Int): IO[String, User] = ???

2. Fail Fast for Programming Errors

// Use defects for bugs
def process(data: Data): UIO[Result] = {
  if (data == null) 
    ZIO.die(new NullPointerException("Data cannot be null"))
  else
    // ... normal processing
    ???
}

3. Don't Over-Catch

// Good: Handle specific errors
operation.catchSome {
  case NetworkError(_) => retry
}

// Avoid: Catching everything hides bugs
operation.catchAll(_ => ZIO.unit)

4. Document Error Cases

/**
 * Fetches user from database.
 * 
 * @return IO[UserError, User] where error can be:
 *         - NotFound: User doesn't exist
 *         - DatabaseError: Connection failed
 *         - Unauthorized: No permission to access
 */
def getUser(id: Int): IO[UserError, User] = ???

Key Takeaways

  • Typed errors make failures explicit and compile-time checked
  • catchAll and catchSome recover from errors
  • fold and foldZIO handle both success and failure
  • mapError transforms error types
  • retry with schedules implements resilient operations
  • either and option convert errors to values
  • Error vs Defect: failures are expected, defects are bugs
  • Compose error handling just like success cases

Common Pitfalls

Swallowing errors:

// Bad: Error is lost
operation.catchAll(_ => ZIO.unit)

// Good: Log or handle appropriately
operation.catchAll(e => Console.printLine(s"Error: $e").orDie)

Not using typed errors:

// Weak: String errors
def process(): IO[String, Result] = ???

// Strong: ADT errors
sealed trait ProcessError
def process(): IO[ProcessError, Result] = ???

Over-retrying:

// Bad: Infinite retries can hang your app
operation.retry(Schedule.forever)

// Good: Bounded retries
operation.retry(Schedule.recurs(5))

What's Next?

You now understand ZIO's powerful error handling system. In Lesson 3: ZIO Environment and Dependency Injection, you'll learn how to:

  • Manage dependencies without frameworks
  • Use ZLayers to compose services
  • Access environment services
  • Mock dependencies for testing
  • Build modular, testable applications

Ready to master dependency injection the functional way? Let's continue!

Additional Resources