Scenario 3—You’re the database administrator for a large grocery chain. When you leave on
Wednesday, there are no problems. When you arrive on Thursday—the day a new sale starts—
you learn that the DSL lines are down. They went down before the local stores could download
the new prices. All scanned goods will ring up at the price they were last week (either sale or
regular) and not at current prices. The provider says it’s working on the DSL problem but can’t
estimate how long repairs will take. How do you approach the problem?
Just like in the real world, there are no right or wrong answers for these scenarios. However, they
all represent situations that have happened and that administrators planned for ahead of time.
High availability refers to the process of keeping services and systems operational during an
outage. In short, the goal is to provide all services to all users, where they need them and when
they need them. With high availability, the goal is to have key services available 99.999 percent
of the time (also known as five nines availability).
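To make the five nines figure concrete, the short sketch below converts an availability percentage into the downtime it permits per year. The function name and the 365-day year are assumptions chosen for illustration; they do not come from any standard.

```python
def downtime_seconds_per_year(availability_percent: float,
                              days_per_year: float = 365.0) -> float:
    """Return the downtime (in seconds) allowed per year at a given availability.

    Assumes a 365-day year by default; this is an illustrative convention.
    """
    seconds_per_year = days_per_year * 24 * 60 * 60
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return seconds_per_year * unavailable_fraction


if __name__ == "__main__":
    for level in (99.9, 99.99, 99.999):
        minutes = downtime_seconds_per_year(level) / 60
        print(f"{level}% availability allows about {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a budget of roughly five and a quarter minutes of downtime per year, which is why it usually requires redundancy rather than fast repairs alone.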
There are several ways to accomplish this, including implementing redundant technology,
backup communications channels, and fault-tolerant systems. A truly redundant system won’t
utilize just one of these methods, but rather some aspect of all of them. The following sections
address these topics in more detail.
Redundancy refers to systems that are either duplicated or that fail-over to other systems in the
event of a malfunction. Fail-over refers to the process of reconstructing a system or switching
over to other systems when a failure is detected. In the case of a server, the server switches to a
redundant server when a fault is detected. This allows service to continue uninterrupted until
the primary server can be restored. In the case of a network, processing switches to another
network path in the event of a network failure in the primary path.
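The server fail-over behavior described above can be sketched in a few lines. This is a minimal illustration, not how any particular product implements it: the `Server` class, the `healthy` flag, and the `select_active` function are hypothetical names, and real systems detect faults with heartbeats or health probes rather than a simple boolean.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for a server and its last known health state."""
    name: str
    healthy: bool = True


def select_active(primary: Server, standbys: list[Server]) -> Server:
    """Return the server that should handle requests right now.

    The primary is preferred whenever it is healthy; otherwise the first
    healthy redundant server takes over until the primary is restored.
    """
    if primary.healthy:
        return primary
    for standby in standbys:
        if standby.healthy:
            return standby
    raise RuntimeError("no healthy server available")


if __name__ == "__main__":
    primary = Server("primary")
    standbys = [Server("standby-1"), Server("standby-2")]

    primary.healthy = False            # a fault is detected on the primary
    print(select_active(primary, standbys).name)

    primary.healthy = True             # the primary has been restored
    print(select_active(primary, standbys).name)
```

The same selection logic applies to the network case if the objects represent network paths instead of servers.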
Fail-over systems can be very expensive to implement. In a large corporate network
or e-commerce environment, a fail-over might entail switching all processing
to a remote location until your primary facility is operational. The primary site
and the remote site would synchronize data to ensure that information is as
up-to-date as possible.
Many newer operating systems, such as Linux, Windows 2000 Advanced Server, and Novell
NetWare 6, are capable of clustering to provide fail-over capabilities. Clustering involves
multiple systems connected together cooperatively and networked in such a way that if any one
of the systems fails, the other systems take up the slack and continue to operate. The overall
capacity of the server cluster may decrease, but the network or service will remain operational.
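The trade-off in a cluster, reduced capacity but continued service, can be modeled with a small sketch. The `Cluster` class, the node names, and the capacity units (requests per second) are all hypothetical; the sketch only tracks which nodes are up and does not represent any vendor's clustering software.

```python
class Cluster:
    """Toy model of a server cluster that tolerates node failures."""

    def __init__(self, node_capacities: dict[str, int]):
        # Map of node name -> capacity (illustrative unit: requests per second).
        self.capacities = dict(node_capacities)
        self.failed: set[str] = set()

    def fail(self, node: str) -> None:
        """Mark a node as failed; the surviving nodes carry the load."""
        if node not in self.capacities:
            raise KeyError(node)
        self.failed.add(node)

    def restore(self, node: str) -> None:
        """Return a repaired node to service."""
        self.failed.discard(node)

    def capacity(self) -> int:
        """Total capacity of the nodes that are still running."""
        return sum(c for n, c in self.capacities.items() if n not in self.failed)

    def is_operational(self) -> bool:
        """The service stays up as long as at least one node has capacity."""
        return self.capacity() > 0


if __name__ == "__main__":
    cluster = Cluster({"node-a": 100, "node-b": 100, "node-c": 100})
    cluster.fail("node-b")
    print(cluster.capacity(), cluster.is_operational())
```

With three equal nodes, losing one leaves two thirds of the original capacity while the service itself remains available.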