


The articles you read in the Availability Digest result from years of
experience in researching and writing a variety of technical documents
and marketing content. It's what we do best, and we provide our services
to others who value high-quality content created by IT specialists.
Ask us about articles, white papers, case studies, web content, manuals,
specifications, and patent disclosures.
Follow us @availabilitydig
System Architecture and Risk Analysis - It's What We Do Best
Let's be realistic. Availability costs
money. Organizations may determine their requirements for system
availability based on expectations of performance, cost, and
downtime avoidance; but it is cost that usually is the primary
driver. Rarely do budgetary decisions take into consideration the
revenue losses, bad press, and customer dissatisfaction incurred
when an existing system goes down and provides no service at all.
Those are the most damaging costs, and some companies never recover.
We at the Availability Digest appreciate
costs: both the expenses of operating a business and the revenues
lost when system budgets are too frugal. Helping a company balance
those two considerations is the service we bring as availability
specialists. With our years of experience in custom software
development and IT consulting, we are adept at discussing our
analyses with staff who possess varying degrees of technical
proficiency.
First things first. Our goal is to help
you architect your systems to provide the proper uptime and data
protection for your individual applications. To do so, we begin
with a risk analysis. During this study, we determine the importance
of each application to those who use it or for whom it is essential.
How much downtime is permissible (the RTO, or Recovery Time
Objective)? How much data loss is permissible (the RPO, or Recovery
Point Objective)? How is lost data reconstructed?
Based on our risk assessment, we
categorize your applications. For instance:
-
Non-critical applications
The loss of these applications will have no serious effect on
any company activity. The application could be down for days,
and days of data could be lost. Examples of such applications
are statistical reporting applications and business forecasting
applications.
-
Task-critical applications
If these applications should go down, certain employees in the
company will be inconvenienced and may have to resort to manual
procedures to complete their tasks. Lost data can be restored
manually from paper copies. Hours of downtime and data loss are
acceptable. Certain tasks may be suspended until the application
is restored. Examples are accounts payable and accounts
receivable.
-
Business-critical applications
The loss of these applications will prevent critical business
functions from being performed. Downtimes of an hour or less are
acceptable, with data loss measured in minutes. Payroll is an
example of such an application.
-
Mission-critical applications
If these applications go down, important business functions that
must be continuous will be impacted. Minutes of downtime and
seconds of data loss can be tolerated. Customer call centers,
customer-facing web sites, and health care applications are examples.
-
Absolutely critical applications
These applications can tolerate no downtime and virtually no data
loss. Recovery time and data loss should be measured in seconds. In
some cases, no data loss is acceptable. Examples are 911 systems,
factory control systems, and large Electronic Funds Transfer systems.
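The categories above can be read as a lookup from an application's RTO and RPO to a tier. The following is a minimal sketch of that idea, not a description of our actual assessment method. The numeric ceilings and the `classify` function are hypothetical, chosen only to mirror the "days, hours, minutes, seconds" wording above; a real assessment would set them per organization.

```python
from datetime import timedelta

# (tier name, RTO ceiling, RPO ceiling), strictest tier first.
# The ceilings are hypothetical cutoffs chosen for illustration only.
TIERS = [
    ("absolutely critical", timedelta(seconds=30), timedelta(seconds=1)),
    ("mission-critical", timedelta(minutes=15), timedelta(seconds=30)),
    ("business-critical", timedelta(hours=1), timedelta(minutes=15)),
    ("task-critical", timedelta(hours=12), timedelta(hours=12)),
]


def classify(rto: timedelta, rpo: timedelta) -> str:
    """Return the strictest tier that either the RTO or the RPO demands."""
    for name, rto_ceiling, rpo_ceiling in TIERS:
        # Either requirement being tight is enough to land in this tier.
        if rto <= rto_ceiling or rpo <= rpo_ceiling:
            return name
    # Anything looser than every ceiling can wait days.
    return "non-critical"


print(classify(timedelta(hours=1), timedelta(minutes=5)))  # business-critical
```

Taking the stricter of the two requirements is a design choice in this sketch: an application that can be down for hours but can lose only seconds of data is still treated as mission-critical.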
Once our risk analysis is complete, the
next step is to architect the appropriate systems for each
application. We will base our suggestions
on your current systems and how they can be rearchitected to
maximize availability, minimize data loss, build on your existing
operational knowledge and experience, and limit the need for
new equipment and added staffing costs. Architectures may include:
-
Non-critical applications
A single server with magnetic-tape backup. Following an
outage, a new server may have to be provisioned, the database
loaded from magnetic tape, the production applications brought
up, and the system tested.
-
Task-critical applications
Applications are run on a production server, with a cold backup
available that is probably doing other work. Database backup
is performed with virtual tape. Upon a production-system outage,
the applications on the backup server will have to be shut down,
the database loaded from virtual tape, the production
applications brought up, and the system tested.
-
Business-critical applications
Applications are run on
a production system with a warm backup. Applications are loaded
onto the backup system but do not have the database opened.
Production data is replicated in real time with a
data-replication engine. In the event of an outage, the backup
applications mount the backup database; and the backup system
continues the operations after being tested.
-
Mission-critical applications
Applications are run on
a production server with a hot backup. Applications are loaded
onto the backup system, and they have the database opened for
read/write activity. In the event of an outage, the backup
system is tested and takes over production operations.
-
Absolutely critical applications
Applications are run in an active/active
architecture. Two (or more) nodes are actively processing
transactions for the same application. They each have their own
local application databases, and the databases in the
application network are kept synchronized via real-time data
replication. Should a node fail, all transactions simply are
routed to surviving nodes.
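The active/active behavior described above, in which transactions are routed only to surviving nodes, might be sketched as follows. This is a simplified illustration under stated assumptions: the `ActiveActiveRouter` class, its method names, and the node names are all hypothetical, round-robin is just one possible routing policy, and database replication between nodes is not modeled.

```python
class ActiveActiveRouter:
    """Round-robin router that skips nodes marked as failed (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)        # all nodes in the application network
        self.healthy = set(self.nodes)  # nodes currently able to process work
        self._counter = 0               # number of transactions routed so far

    def mark_failed(self, node):
        self.healthy.discard(node)

    def mark_recovered(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def next_node(self):
        """Pick the node that should process the next transaction."""
        survivors = [n for n in self.nodes if n in self.healthy]
        if not survivors:
            raise RuntimeError("no surviving nodes to route to")
        node = survivors[self._counter % len(survivors)]
        self._counter += 1
        return node


router = ActiveActiveRouter(["node-a", "node-b"])
first, second = router.next_node(), router.next_node()
router.mark_failed("node-a")
after_failure = [router.next_node() for _ in range(3)]
print(first, second, after_failure)
```

In this sketch no failover step is needed: once a node is marked failed, the next call simply chooses among the survivors.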
The applications must be rearchitected
in a controlled manner. As part of our consulting responsibilities,
we will monitor the rearchitecting of each application. Our
agreement with you includes the assurance that the required RTOs and
RPOs can be met. Of particular importance is the control of updates
in active/backup configurations to prevent configuration drift,
which might preclude a successful failover. Of equal importance is
automating active/backup failovers to the extent possible and
providing extensive operational training to ensure successful
failovers. All are competencies that we offer.
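One simple way to watch for configuration drift is to compare a snapshot of the active system's settings against the backup's and report every difference. The sketch below assumes each configuration has already been collected into a flat dictionary; the `find_drift` function, the `MISSING` marker, and the example keys and values are hypothetical.

```python
MISSING = "<missing>"  # marker for a setting present on only one system


def find_drift(active: dict, backup: dict) -> dict:
    """Map each differing setting to its (active value, backup value) pair."""
    drift = {}
    for key in sorted(active.keys() | backup.keys()):
        active_value = active.get(key, MISSING)
        backup_value = backup.get(key, MISSING)
        if active_value != backup_value:
            drift[key] = (active_value, backup_value)
    return drift


# Hypothetical snapshots of an active system and its backup.
active = {"app_version": "2.4.1", "db_schema": "17", "tls": "on"}
backup = {"app_version": "2.4.0", "db_schema": "17"}

for setting, (on_active, on_backup) in find_drift(active, backup).items():
    print(f"{setting}: active={on_active} backup={on_backup}")
```

An empty result means the two snapshots agree. Any entry is a candidate reason for a failover to fail and should be reconciled before the next failover test.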
Availability costs money. So does
system downtime. For every application and for every system, there
is an affordable balance between the two. Determining the proper
balance for your company is what risk assessment and system
architecture are all about, and it's what we do best.
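The balance between the cost of availability and the cost of downtime can be illustrated with simple arithmetic: add each architecture's yearly cost to its expected yearly downtime multiplied by what an hour of downtime costs the business, then compare the totals. Every figure below is a made-up placeholder, and the `total_annual_cost` and `best_option` helpers are hypothetical. The sketch shows only the shape of the comparison, not real pricing.

```python
# Hypothetical options: yearly cost of the architecture and the hours of
# downtime expected per year. All numbers are placeholders.
OPTIONS = [
    {"name": "single server", "annual_cost": 20_000, "downtime_hours": 48.0},
    {"name": "warm backup", "annual_cost": 80_000, "downtime_hours": 2.0},
    {"name": "active/active", "annual_cost": 250_000, "downtime_hours": 0.05},
]


def total_annual_cost(option: dict, downtime_cost_per_hour: float) -> float:
    """Architecture cost plus the expected cost of downtime for one year."""
    return option["annual_cost"] + option["downtime_hours"] * downtime_cost_per_hour


def best_option(downtime_cost_per_hour: float) -> str:
    """Name of the option with the lowest total annual cost."""
    cheapest = min(
        OPTIONS, key=lambda o: total_annual_cost(o, downtime_cost_per_hour)
    )
    return cheapest["name"]


for hourly_cost in (500, 10_000, 1_000_000):
    print(hourly_cost, best_option(hourly_cost))
```

With these placeholder figures, the cheapest option moves from the single server to the warm backup to active/active as the hourly cost of downtime rises.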
For
further information and price quotations, contact Dr. Bill
Highleyman at billh@availabilitydigest.com.