Replication Guide and Reference
This chapter introduces the key replication tasks that
you perform at various stages in the replication process. The tasks are
grouped into these major stages:
- Planning your replication requirements
- Setting up your replication environment
- Operating in your replication environment
After you read this chapter, go to Administration for detailed information about these tasks. Also, see
Operations for specific information about using the Capture and Apply
programs on particular operating systems.
An important step in designing an appropriate replication environment
is determining the characteristics of your application data, who needs to
access the data, and how frequently they need to access it.
You can use DB2 data replication to maintain data in more than one location
and keep the various copies of it synchronized. You must determine
where your source data will be coming from. You must decide whether you
want all or some of the source information copied, or whether you want only
changes copied, and how many copies (or targets) you need. You also
need to determine where the copies will be located.
Although you cannot update the source tables and target tables
synchronously, you can schedule the updates to meet the needs of your
applications and your replication environment. The frequency of
replication depends on how much lag time is acceptable between the time that
the source is updated and the time that the targets are updated.
Therefore, you must decide how synchronized the copies must be with the source
and with each other before you can come up with a replication model.
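As a rough illustration of the lag-time trade-off described above, the following sketch estimates the worst-case lag of an interval-based schedule. The model (lag is at most one interval plus the run time of one replication cycle), the function names, and the numbers are assumptions made for illustration; they are not part of DB2.

```python
from datetime import timedelta


def worst_case_lag(interval: timedelta, cycle_duration: timedelta) -> timedelta:
    """Upper bound on lag in this simplified model.

    A change committed just after a cycle starts waits one full interval
    for the next cycle, then that cycle's run time, before it reaches
    the target.
    """
    return interval + cycle_duration


def meets_requirement(interval: timedelta,
                      cycle_duration: timedelta,
                      acceptable_lag: timedelta) -> bool:
    """Return True if the schedule keeps worst-case lag within the limit."""
    return worst_case_lag(interval, cycle_duration) <= acceptable_lag


# Hypothetical numbers: replicate every 15 minutes, each cycle takes 2 minutes.
interval = timedelta(minutes=15)
cycle = timedelta(minutes=2)
print(meets_requirement(interval, cycle, timedelta(minutes=20)))
print(meets_requirement(interval, cycle, timedelta(minutes=10)))
```

If the check fails for the lag that your applications can tolerate, the schedule needs a shorter interval or a faster cycle.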
After you understand your application data requirements, you can design the
replication model that will help you meet those requirements. There are
many factors that you need to consider when you design your model. These
are some of the more important decisions that you need to make:
- The replication configuration
- Based on your data needs, you must decide whether you need a
consolidation, distribution, update-anywhere, or occasionally connected
configuration. You have the flexibility to design your environment so
that it uses one of these configurations or some combination of them.
- Where to locate the control server
- The control server can be located at the source server, the target
server, or any database server that the Apply program can connect to.
Because the Apply program frequently reads the control tables at the control
server, you will get slightly better performance if you place the control
tables on the same server as the Apply program instead of placing them
centrally. Alternatively, you can have your Apply programs share a single
control server so that your control information is stored centrally. A
central control server is popular because it simplifies the administration
of large networks, but it has two drawbacks: the Apply program must access
the control information over the network and, if the control server goes
down, all of the Apply processes are affected. However, if the source
server is in a secure environment, locating the control server at the source
server can improve security and let you manage and monitor replication
subscriptions centrally.
- The type of target tables to use
- The type of target table that you use depends on your replication
requirements. Each type is best suited for specific situations.
For example, a replica is the only type of target table that you can use for
update-anywhere replication; and a row-replica is the only type of target
table you can use with DataPropagator for Microsoft Jet.
- Whether to use existing target tables
- You can let the administration interface create the target table for you
or you can use an existing table as a target. If the existing tables
are DB2 tables, the data types are supported by the DB2 data replication
components. If your replication environment includes non-IBM databases,
some of the data types might not map directly to the data types of the
source tables that you are using.
- Which columns to make available for replication
- You can choose to capture only the after-image column values or both the
before-image column values and the after-image column values. If you
will be using the targets for auditing purposes, or if you have replica target
tables, you must copy both the after-image and before-image column
values.
- How to capture SQL operations
- You might want to capture all updates as two rows in the CD table or in
the CCD table of a non-IBM source: a DELETE of the before-image column
values followed by an INSERT of the after-image column values. This
includes updates of columns that will be the primary key of the target,
columns that will be the partitioning key of the target, or columns that are
part of the WHERE clause or predicate of the subscription set. You
might need to adjust the size of the CD table to accommodate this increased
overhead.
- The level of constraints
- You need to define referential constraints to enforce referential
integrity only if you have target tables that are replica tables. If
you have a read-only target table, you do not need to set constraints at the
target. The referential integrity of other types of target tables is
ensured if you define your subscription sets appropriately.
- Which joins to use
- Joins are described in views, which in turn are defined in replication
sources. For example, you might use a view to change the name of copied
columns, to reference columns from related tables in the WHERE clause in your
subscription member predicate, to incrementally maintain copies that are inner
joins of two or more tables, or to replicate information from one table when
an update is made to another table.
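To illustrate the "How to capture SQL operations" decision in the list above, here is a toy model of what happens when an update changes a target key column. The row format, the operation codes "U", "D", and "I", and both functions are simplifications assumed for this sketch; they do not reproduce the actual CD table layout or Apply program logic.

```python
def cd_rows_for_update(before: dict, after: dict, as_delete_insert: bool) -> list:
    """Return the change rows that one UPDATE produces in this toy model.

    Each row is (operation, values), where operation is "U", "D", or "I".
    """
    if as_delete_insert:
        # Two rows: remove the before-image, then add the after-image.
        return [("D", before), ("I", after)]
    # One row that carries only the after-image.
    return [("U", after)]


def apply_rows(target: dict, rows: list, key: str) -> dict:
    """Apply change rows to a target held as {key value: row}."""
    result = dict(target)
    for op, values in rows:
        if op == "D":
            result.pop(values[key], None)
        else:
            # "I" or "U": write the row under its (possibly new) key value.
            result[values[key]] = values
    return result


# Hypothetical update that changes the key column "id" from 1 to 2.
before = {"id": 1, "city": "Austin"}
after = {"id": 2, "city": "Austin"}
target = {1: before}

single = apply_rows(target, cd_rows_for_update(before, after, False), "id")
split = apply_rows(target, cd_rows_for_update(before, after, True), "id")
print(sorted(single))  # [1, 2]: the row under the old key is left behind
print(sorted(split))   # [2]
```

The two-row form costs one extra change row per update, which is why the CD table might need more space.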
When you are ready to plan your replication environment, see Planning for replication for detailed planning information.
After you design the replication model, you must set up your
replication environment. These steps are involved in setting up your
replication environment:
- Setting up the system
- Defining the replication criteria
- Performing the initial replication
The rest of this section introduces the steps involved in setting up your
environment. Setting up your replication environment contains detailed instructions on setting up your
replication environment.
To set up the system, you perform the following steps:
- Migrate from previous releases of DataPropagator products.
- Grant access to the proper user IDs.
To set up the replication criteria, you perform the following
steps:
- Configure the administration tool. For example, if you are using
DJRA, you need to associate passwords with databases.
- Customize and create replication control tables.
- Customize change data (CD) tables. This step is optional.
You can change the default name and table space of your CD tables. If
you are using the DB2 Control Center, you must customize your CD tables
before you define a replication source. If you are using the
DJRA tool, you customize the CD tables when you define the replication
source.
- Define replication sources. This step includes identifying the
table or view from which you want data copied and the types of changes that
you want captured.
- Define subscription sets and subscription-set members. This step
includes associating the replication source with the target to which you want
the changes replicated. You can define subscription sets and
subscription-set members at any time prior to starting the Apply
program.
- Configure the Capture program. This step includes enabling the
source server for logging; it also includes creating and binding the
Capture program package to the source server.
- Configure the Apply program. This step includes creating and
binding the Apply program package to the source server, the target
server, and the control server; it also includes creating and binding the
Apply program to the target server. (See footnote 11.)
Important: When you set up your
replication environment, you must start the Capture program and let it
initialize fully before you start any Apply programs.
To perform the initial replication, you must perform the following steps in
the order shown:
- Make sure that at least one replication source is defined.
- Start the Capture program. This step includes specifying invocation
parameters (such as NOPRUNE, which prevents automatic pruning of the CD and
UOW tables). After the Capture program is fully initialized, it will
not capture any changes until the Apply program signals it to do so.
- If you have not already done so, define at least one subscription set
and one subscription-set member.
- Start one or more Apply programs. This step includes specifying
invocation parameters (such as LOADX, which calls ASNLOAD, an exit
routine to initialize target tables; see footnote 12). Each Apply program will perform a
full-refresh copy for all subscription-set members and the Capture program
will begin capturing changes for the associated replication
sources.
Tip: Use the WARMNS option in the Capture program if you want to be able to repair
any problems (such as unavailable databases or table spaces) that might
prevent a warm start from occurring.
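The ordering rules for the initial replication can be sketched as a small prerequisite check. The step names and the PREREQUISITES table below are one reading of the steps above, written for illustration; they are hypothetical and are not commands or an API of the Capture or Apply programs.

```python
# Hypothetical step names; each step lists the steps that must come first.
PREREQUISITES = {
    "define_source": set(),
    "start_capture": {"define_source"},
    "define_subscription_set": {"define_source"},
    "start_apply": {"start_capture", "define_subscription_set"},
}


def first_violation(steps: list):
    """Return a message for the first out-of-order step, or None if all are valid."""
    done = set()
    for step in steps:
        missing = PREREQUISITES[step] - done
        if missing:
            return f"{step} requires {', '.join(sorted(missing))}"
        done.add(step)
    return None


good = ["define_source", "start_capture", "define_subscription_set", "start_apply"]
bad = ["define_source", "define_subscription_set", "start_apply"]
print(first_violation(good))  # None
print(first_violation(bad))   # start_apply requires start_capture
```

The sketch checks order only. It does not model the requirement that the Capture program be fully initialized before any Apply program starts.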
You will probably need to add replication sources and subscription sets to
your replication environment from time to time.
To add to your replication environment, you must perform the following
steps in the order shown:
- Define the new replication source.
- Run the Capture reinit command, or stop the Capture program and
warm start it.
- Define the new subscription sets and subscription-set members.
- The Apply program will automatically recognize the new subscription set if
the Apply program is already running and it uses the Apply qualifier that is
associated with the new subscription set. Otherwise, you must start a
new Apply program using the appropriate Apply qualifier before the Apply
program can recognize the new subscription set.
After you define your replication environment on one system (for
example, a test system), you can copy the replication environment to another
system (for example, a production system). You use the promote
functions to reverse-engineer your tables, replication sources, and
subscription sets and to create a script file with the appropriate data
definition language (DDL) and data manipulation language (DML). For
more information about the promote functions, see Copying your replication configuration to another system and the online help for the administration
interface.
After your replication environment is up and running and updates are
replicated, you need to perform periodic maintenance tasks. These
include the following tasks:
- Configuring the pruning of control tables
- The UOW and CD tables will grow too large if the contents are not pruned
regularly. You can configure your system to prune automatically, or you
can prune manually. You control how frequently obsolete information
will be removed from these tables. If the tables are not pruned
often enough, the table space that they are in will run out of space,
which will force the Capture program to stop. If they are pruned too
often or during peak times, the pruning interferes with the change capture
process. You should determine the optimal pruning frequency for your
replication environment.
- Monitoring important criteria
- Many factors determine how well your replication environment
performs. You can use the Replication Monitor, which is part of DJRA,
to generate a report that will help you monitor the activities of the Capture
and Apply components, as well as the status of the subscription sets.
For example, the report contains historical information to help you determine
trends about subscription latencies.
- Dealing with data modification conflicts
- If you are using update-anywhere replication, and you did not design your
configuration to prevent update conflicts, you must handle update conflicts
and rejected transactions.
- Performing regular database maintenance
- If you want your replication environment to run smoothly, you must
regularly perform database maintenance tasks. For example, use the
RUNSTATS utility against the DB2 catalog tables to collect new statistics for
tables and indexes. Also use the RUNSTATS utility once after the CD and
UOW tables have sufficient data in them so that the DB2 Optimizer will use
indexes on them. Periodically use the REORG utility (or the RGZPFM
command on AS/400) for the change data tables, the unit-of-work table, and the
target tables. You must also delete rows from the Apply trail table,
which contains subscription set statistics and error information.
- Coordinating with DB2 utility operations
- If you want to run DB2 utilities (such as REORG, RUNSTATS, BIND PACKAGE,
and REVOKE) that will use the table spaces that contain the replication
control tables, you must stop the Capture and Apply programs before running
the utilities.
- Changing your replication configuration as your business needs change
- You are likely to need to modify your replication environment from time to
time. Whether you add a new column to an existing source table, or drop
a source table, you will need to modify your replication criteria.
Also, you will need to maintain password files. For more information
about modifying your replication configuration, see Modifying your replication configuration.
- Troubleshooting
- If you find that your replication environment is not performing as you
expected, or if you can't replicate data, you can run the Replication
Analyzer. The Replication Analyzer is a tool that is packaged with DB2
Universal Database and the DataJoiner Replication Administration tool.
You can use the Replication Analyzer to analyze the behavior of the Capture
program or the Apply program. It can answer such questions as:
"why is the Capture program not capturing?" and "why is the Apply program not
applying?" The Replication Analyzer can help diagnose problems, verify
replication setup, and offer suggestions for performance tuning. You
can also look in the Apply trail table for status information about the Apply
program, or in the Capture trace table for status information about the
Capture program. For details see Problem determination.
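As an illustration of the pruning task in the list above, the following sketch separates obsolete change rows from rows that are still needed. It assumes that a change row becomes obsolete once every subscription set that reads the source has applied it, and it uses integer sequence numbers for simplicity. The row format, the subscription set names, and both functions are hypothetical and do not reflect the actual control table layout.

```python
def safe_prune_point(applied_points: dict):
    """Return the highest sequence number that every subscription set has applied.

    Returns None when no subscription set is registered, in which case
    nothing is treated as obsolete.
    """
    if not applied_points:
        return None
    return min(applied_points.values())


def split_cd_rows(cd_rows: list, applied_points: dict):
    """Split change rows into (obsolete, kept) around the safe prune point."""
    point = safe_prune_point(applied_points)
    if point is None:
        return [], list(cd_rows)
    obsolete = [row for row in cd_rows if row["seq"] <= point]
    kept = [row for row in cd_rows if row["seq"] > point]
    return obsolete, kept


# Hypothetical state: five change rows, two subscription sets.
cd_rows = [{"seq": n} for n in range(1, 6)]
applied = {"SET_A": 3, "SET_B": 4}

obsolete, kept = split_cd_rows(cd_rows, applied)
print([row["seq"] for row in obsolete])  # [1, 2, 3]
print([row["seq"] for row in kept])      # [4, 5]
```

In this model, the slowest subscription set determines how much can be pruned, so a stalled Apply program lets the change tables grow.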
For general information about operating in a replication environment, see Operating DB2 DataPropagator. For information about operating in a particular
operating system, see the appropriate chapter in Operations.
Footnotes:
- 11: If the Capture program and the Apply program are not on OS/390, they
bind automatically.
- 12: If you use non-IBM load utilities, it is recommended that you use the
offline load feature in DJRA. For more information on setting up the offline
load feature with DJRA, see Loading target tables offline using DJRA.