DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Fixing Common Oracle Database Problems
  • OpenShift Data Foundation on IBM Cloud Using Terraform, Part 1: Deployment
  • Develop With OCI Real-Time Speech Transcription and Oracle Database NL2SQL/Select AI To Speak With Your Data
  • Streamlining Data Integration

Trending

  • Contextual AI Integration for Agile Product Teams
  • AI, ML, and Data Science: Shaping the Future of Automation
  • Power BI Embedded Analytics — Part 2: Power BI Embedded Overview
  • Developers Beware: Slopsquatting and Vibe Coding Can Increase Risk of AI-Powered Attacks
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. How To Replicate Oracle Data to BigQuery With Google Cloud Datastream

How To Replicate Oracle Data to BigQuery With Google Cloud Datastream

Set up Oracle 19c to BigQuery replication using Google Cloud Datastream, covering prerequisites, networking, CDC setup, and job validation.

By 
Karteek Kotamsetty user avatar
Karteek Kotamsetty
·
Justin Taras user avatar
Justin Taras
·
Apr. 29, 25 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
17.5K Views

Join the DZone community and get the full member experience.

Join For Free

This technical guide outlines the steps to set up data replication using Google Cloud Datastream. Specifically, it details the process of setting up data replication from an Oracle 19c database hosted on a Google Compute Engine virtual machine into Google BigQuery. The tutorial covers all necessary steps, including prerequisites—enabling APIs and configuring firewalls, setting up the Oracle source environment, establishing secure networking, creating connection profiles for both Oracle and BigQuery in Datastream, preparing the Oracle database for Change Data Capture (CDC), and finally, creating and validating the Datastream replication job.

Prerequisites

Before proceeding, ensure the following Google Cloud APIs are enabled within your project. This is done through the Google Cloud Console under APIs & Services

  • Compute Engine API
  • Service Health API
  • Datastream API

Image of Service Health API enable button

Image of Datastream API enable button


Create a Firewall rule:

  • Name: dzone-firewallrule
  • Source IPv4 Ranges: 10.0.101.0/24 (Adjust to match the internal IP range of your Oracle VM's subnet).
  • Specified Ports: 22 (for SSH access), 1521 (Oracle Listener default port), TCP protocol.
  • Click on "CREATE."

Create a firewall rule instruction.

Create a firewall rule instruction, protocol and ports.


Create another firewall rule:

  • Name: dzone-datastream
  • Network: dzone-custom-vpc
  • Direction of traffic: Ingress (Incoming connections)
  • Action on match: Allow
  • Targets: All instances in the network
  • Source Filters: IPV4 Ranges
  • Source IPV4 range: 10.0.103.0/29 (Datastream's private connectivity IP range; this may vary depending on your setup)
  • Protocols and Ports: Allow All
  • Click on "CREATE."

Create a firewall rule instruction.

Create a firewall rule instruction, protocol and ports.


Setting up Oracle Database in Google Compute Engine

In today's data-driven world, database systems are essential. Oracle Database offers a Relational Database Management System (RDBMS) for handling critical business information reliably and securely.

To gain flexibility and scalability, organizations can choose to host these databases in the cloud. Google Compute Engine (GCE), a core part of Google Cloud Platform, provides high-performance Virtual Machines (VMs). This allows businesses to run their Oracle Database workloads on GCE to address their specific needs. 

  • Navigate to Compute Engine and click on "CREATE INSTANCE" to Create the Compute Engine VM where we will install the Oracle Database.

Image showing how to create instance on Google Compute Engine.



  • Specify the VM name as oracle19c, leave the rest of the default settings.

Image that shows how to specify the VM name as Oracle19c on Google Compute Engine.



  • Select OS and storage, click on "CHANGE."

Image showing how to select OS and storage on Google Compute Engine.



  • For demonstration purposes we are using Red Hat Enterprise Linux 8 as the operating system, please update settings according to your environment settings. Click on "SELECT."

Image showing how select OS on Google Compute Engine.



  • Select the VPC or create a custom VPC and Click on "CREATE."

Image that shows how to select/create VPC on Google Compute Engine



  • Once the VM is created successfully, Click on "SSH" to log in to the VM.

Image that shows how to click on SSH to login to VM on Google Compute Engine.



  • Click on "authorize" and "verify" if you are able to log in to the Compute Engine Instance VM.
  • Run the command, sudo yum install wget, in case wget is not installed in your VM.
  • Go to Oracle and download Oracle Database 19c (19.3) for Linux x86-64 (RPM).
  •  Cancel the download. Next, copy the download URL and run the command wget in your VM.

Image showing how to run command in VM.


  • Run the command curl -o oracle-database-preinstall-19c-1.0-2.el8.x86_64.rpm https://yum.oracle.com/repo/OracleLinux/OL8/appstream/x86_64/getPackage/oracle-database-preinstall-19c-1.0-2.el8.x86_64.rpm.

Image showing how to run command in VM.


  • Run the command, sudo yum -y localinstall oracle-database-preinstall-19c-1.0-2.el8.x86_64.rpm.

Image showing how to run command in VM.

  • Run the command rm oracle-database-preinstall-19c-1.0-2.el8.x86_64.rpm.
  • Run the command sudo yum -y localinstall oracle-database-ee-19c-1.0-1.x86_64.rpm.

Image showing how to run command in VM.


  • Notice the message confirming the successful installation of the Oracle Database.

Image of successful installation of Oracle Database.


  • Run the command sudo /etc/init.d/oracledb_ORCLCDB-19c configure.

Image showing command prompt.


  • Run the command, sudo su - oracle to log in to the Oracle user account. 

vi .bash_profile, export ORACLE_HOME=/opt/oracle/product/19c/dbhome_1 and export PATH=$PATH:$ORACLE_HOME/bin

Image showing command prompt.


  • Capture the VM's private IP and copy the hostname from $ORACLE_HOME/network/admin/tnsnames.ora and update /etc/hosts file.

Image showing command prompt.


  • Update the tnsnames.ora; update the HOST internal IP address to the internal IP address of the compute engine VM.

MY_PDB_CONNECTION_ALIAS =

  (DESCRIPTION =

    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.101.26)(PORT = 1521)) 

    (CONNECT_DATA =

      (SERVER = DEDICATED)

      (SERVICE_NAME =ORCLPDB1)

    )

 )


Image showing command prompt.


  • Run the command, sqlplus / as sysdba.

Image showing command prompt.


  • Run the command, SELECT name FROM v$pdbs; and capture the PDB Name and run the command ALTER SESSION SET CONTAINER=ORCLPDB1;

Image showing command prompt.


  • Run the command, @/opt/oracle/product/19c/dbhome_1/demo/schema/human_resources/hr_main.sql, specify the password you want to use, tablespace as users, temporary tablespace as temp, log directory as $ORACLE_HOME/demo/schema/log/.

Image showing command prompt.


  • If there are no errors and commit is complete, you should see the success message below.

Image showing PL/SQL procedure successfully completed.


Setting up Google Datastream

Google Datastream is a fully managed, serverless Change Data Capture (CDC) and replication service that enables you to synchronize data across heterogeneous data sources reliably and with minimal latency. It simplifies building data pipelines from databases like MySQL, Oracle, and PostgreSQL into Google Cloud services like BigQuery and Cloud Storage, empowering real-time analytics and data warehousing.

  • Navigate to Datastream and click on "Private connectivity."

Image showing how to navigate Google Datastream


  • Click on "CREATE CONFIGURATION."

Image showing how to navigate Google Datastream.


  • Specify the Configuration name and Configuration ID as dzone-oracle-configuration, Region as us-central1, VPC dzone-custom-vpc and IP address range as 10.0.103.0/29 then click on "CREATE."

Image showing how to navigate Google Datastream.


  • If you notice an error - "constraints/compute.restrictVpcPeering."

Image showing how to navigate Google Datastream.


  • Navigate to Organization policies, search for constraints/compute.restrictVpcPeering and click on "Restrict VPC peering usage."

Image showing how to navigate Google Datastream.


  • Click on "Manage Policy."

Image showing how to navigate Google Datastream.


  • Click on "Override parent's policy," Policy enforcement as "Replace," Policy values to "Allow All," Click on "Done" and click on "Set policy."

Image showing how to navigate Google Datastream.


  • Navigate back to Datastream and delete the old configuration and click on "CREATE CONFIGURATION."

Image showing how to navigate Google Datastream.


  • Specify the Configuration name and Configuration ID as dzone-oracle-configuration, Region as us-central1, VPC dzone-custom-vpc and IP address range as 10.0.103.0/29 and click on "CREATE."

Image showing how to navigate Google Datastream.


  • Navigate to Connection profiles and click on "CREATE PROFILE."

Image showing how to navigate Google Datastream.


  • Click on "BigQuery."

Image showing how to navigate Google Datastream.


  • Specify the Connection profile name and Connection profile ID as dzone-bigquery-profile, Region as us-central1 and click on "CREATE."

Image showing how to navigate Google Datastream.


  • Click on "CREATE PROFILE."

Image showing how to navigate Google Datastream.


  • Click on "Oracle."

Image showing how to navigate Google Datastream.


  • Specify the Connection profile ID as dzone-oracle-profile, Region as us-central1, Hostname as 10.0.101.26, Port as 1521, username as HR, Password as Oracle123, System identifier as ORCL. 

Image showing how to navigate Google Datastream.


  • Click on "CONTINUE."

Image showing how to navigate Google Datastream.

Click on "Connectivity method" then set Private connectivity configuration as dzone-oracle-configuration and click on "CONTINUE."

Image showing how to navigate Google Datastream.


  • Navigate to the Oracle Database running in the Compute Engine VM. Run the commands:

SHUTDOWN IMMEDIATE;

STARTUP MOUNT;

ALTER DATABASE ARCHIVELOG;

ALTER DATABASE OPEN;


Image of Oracle Database running in the Compute Engine VM.


  • Exit out of the Oracle Database, then run the commands:

rman target /

CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 4 DAYS;

exit

Connect back to the Oracle database with the command:

sqlplus / as sysdba


Image of Oracle Database running in the Compute Engine VM.


  • Run the commands:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;


Image of Oracle Database running in the Compute Engine VM.


  • Run the commands:

CREATE USER C##dstream IDENTIFIED BY welcome1 CONTAINER=ALL;

GRANT CREATE SESSION TO C##dstream;

GRANT SET CONTAINER TO C##dstream;

GRANT SELECT ON SYS.V_$DATABASE TO C##dstream;

GRANT SELECT ON SYS.V_$LOGMNR_CONTENTS TO C##dstream;

GRANT EXECUTE ON DBMS_LOGMNR TO C##dstream;

GRANT EXECUTE ON DBMS_LOGMNR_D TO C##dstream;

GRANT LOGMINING TO C##dstream;

GRANT EXECUTE_CATALOG_ROLE TO C##dstream;


Image of Oracle Database running in the Compute Engine VM.


  • Run the commands:

ALTER PLUGGABLE DATABASE ORCLPDB1 OPEN;

ALTER SESSION SET CONTAINER=ORCLPDB1;

GRANT CREATE SESSION TO C##dstream;

GRANT SET CONTAINER TO C##dstream;

GRANT SELECT ANY TABLE TO C##dstream;

GRANT SELECT ON SYS.V_$DATABASE TO C##dstream;

GRANT SELECT ON SYS.V_$ARCHIVED_LOG TO C##dstream;

GRANT SELECT ON DBA_SUPPLEMENTAL_LOGGING TO C##dstream;

GRANT SELECT ON DBA_EXTENTS TO C##dstream;


Image of Oracle Database running in the Compute Engine VM.


  • Navigate to Connection Profiles and update the dzone-oraclebq-profile profile with new user and password C##dstream, welcome1. Click on "SAVE."

Image showing how to navigate Google Datastream.


  • Create the Data Stream.

Image showing how to navigate Google Datastream.


  • Click on "CONTINUE."

Image showing how to navigate Google Datastream.


  • Select the Source connection profile as dzone-oraclebq-profile and click on "RUN TEST," once you get the message as Test passed click on "CONTINUE."

Image showing how to navigate Google Datastream.


  • Select objects to include as Specific schemas and tables, select the HR schema.

Image showing how to navigate Google Datastream.



  • Select Logminer in the "Specify the CDC method." Click on "CONTINUE."

Image showing how to navigate Google Datastream.


  • Select the dzone-bigquery-profile and click on "CONTINUE."
  • Select the Schema Grouping as Dataset for each schema and Stream mode as Append-only. 
  • Click on "RUN VALIDATION"; once all the validations are successful, click on "CREATE & START."

Image showing how to navigate Google Datastream.


  • Notice that the HR schema and the tables are created successfully in BigQuery.

Image showing how to navigate Google Datastream.


Summary

We successfully configured a data replication pipeline using Google Cloud Datastream. We created specific firewall rules within the custom VPC to allow traffic for SSH (port 22), Oracle (port 1521), and Datastream's private connectivity IP range. We provisioned a Compute Engine VM, installed Oracle Database 19c Enterprise Edition, configured the database instance and network settings and populated it with the sample HR schema. We established a Datastream private connectivity configuration linked to the custom VPC.

We created Datastream connection profiles for the Oracle source and for the BigQuery destination. We configured the Oracle database for CDC by enabling ARCHIVELOG mode, setting supplemental logging, and creating a dedicated Datastream user with the necessary Logminer and object access privileges. We successfully created and tested a Datastream, selecting the Oracle and BigQuery connection profiles, targeting the HR schema, and utilizing the Logminer method for CDC.

Oracle Database Virtual private cloud Data (computing)

Opinions expressed by DZone contributors are their own.

Related

  • Fixing Common Oracle Database Problems
  • OpenShift Data Foundation on IBM Cloud Using Terraform, Part 1: Deployment
  • Develop With OCI Real-Time Speech Transcription and Oracle Database NL2SQL/Select AI To Speak With Your Data
  • Streamlining Data Integration

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!