cdc using aws dms

aws dms

  • managed service for database migration and replication; supports change data capture (cdc)

  • a data migration task within dms handles:

    • initial data load to ensure synchronization (before cdc, transfer initial snapshot of source db to target db)

    • ongoing replication (dms task configured to continuously capture changes from source db and apply to target db)

    • flexible config (aws dms lets you customize the replication process, e.g. filter tables, transform data, configure error handling)

  • key components of aws dms

    • replication instance (construction worker) - managed compute resource (an ec2 instance run by dms) that powers the migration process. connects to the source and target databases, applies transformations, routes the data to the target. its security groups + network access must allow it to reach both endpoints

    • endpoint - defines connection details for specific database. db type + hostname + port + username + password

    • dms task (blueprint) - config defining a specific migration/replication job. specifies src and target endpoints + replication method (full load, cdc, or full load + cdc) etc.
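
to make the task "blueprint" concrete, here is a minimal sketch in python, assuming boto3 and its `create_replication_task` call on the `dms` client. the task identifier, schema, table names, arns and region are hypothetical placeholders, not values from these notes. the table mappings use a selection rule per table, which is one way to get the "filter tables" behaviour mentioned above.

```python
import json


def build_table_mappings(schema, tables):
    """Build a DMS table-mapping document that includes only the given tables."""
    rules = []
    for i, table in enumerate(tables, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(i),
            "rule-name": str(i),
            "object-locator": {"schema-name": schema, "table-name": table},
            "rule-action": "include",
        })
    # DMS expects the mappings as a JSON string, not a dict.
    return json.dumps({"rules": rules})


def build_task_params(source_arn, target_arn, instance_arn, schema, tables):
    """Assemble keyword arguments for create_replication_task."""
    return {
        "ReplicationTaskIdentifier": "orders-cdc-task",  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # initial snapshot first, then ongoing replication
        "MigrationType": "full-load-and-cdc",
        "TableMappings": build_table_mappings(schema, tables),
    }


def create_task(params, region="us-east-1"):
    """Create the task in AWS (needs credentials and existing endpoints)."""
    import boto3  # imported lazily so the builders above work without AWS access

    dms = boto3.client("dms", region_name=region)
    return dms.create_replication_task(**params)
```

the builders are kept separate from the aws call so the config can be inspected or unit-tested before anything is created. `MigrationType` can also be `"full-load"` or `"cdc"` if only one phase is needed.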

process flow

  • change occurs in src db and is recorded in its transaction log

  • replication instance monitors src db continuously, extracts the changes, sends them to a kinesis data stream (configured here as the dms target endpoint)

  • kinesis data stream acts like a pipeline, receives change data, stores it in shards

  • application code (for example) consumes data from kinesis, transforms it, and writes the processed data directly to the target db
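
the consumer step can be sketched as below. this is a minimal example, assuming dms writes each change to kinesis as a json message with a `data` object (the row) and a `metadata` object carrying `record-type` and `operation`. the in-memory dict standing in for the target db, the `id` primary key, and the `poll_shard` helper are hypothetical, for illustration only.

```python
import json


def apply_change(raw, target):
    """Apply one DMS change message (JSON bytes or str) to an in-memory target."""
    message = json.loads(raw)
    meta = message.get("metadata", {})
    if meta.get("record-type") != "data":
        return None  # skip control records (e.g. table created/dropped)
    row = message["data"]
    key = row["id"]  # assumption: every row carries an "id" primary key
    op = meta.get("operation")
    if op == "delete":
        target.pop(key, None)
    elif op in ("load", "insert", "update"):
        target[key] = row  # upsert the latest version of the row
    return op


def poll_shard(stream_name, shard_id, target, region="us-east-1"):
    """Read one batch from a shard and apply it (needs AWS credentials)."""
    import boto3  # imported lazily so apply_change works without AWS access

    kinesis = boto3.client("kinesis", region_name=region)
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest record
    )["ShardIterator"]
    response = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in response["Records"]:
        apply_change(record["Data"], target)
    return response.get("NextShardIterator")
```

a real consumer would loop on the returned `NextShardIterator`, checkpoint its position, and write to an actual database instead of a dict. the point here is only the shape of the transform-and-apply step.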
