Build Data Intense Applications using Google Cloud Platform

Pooja Kelgaonkar
Senior Data Architect
Automatic Summary

Building Data-Intensive Applications with Google Cloud Platform

Hello and welcome to our informational session discussing how to build data-intensive applications using Google Cloud Platform (GCP). Today, we'll be exploring the pillars of application design, the available GCP services, as well as several patterns that could be applied using GCP.

About the Speaker

I'm Pooja Kelgaonkar, a data professional with 16+ years of experience in the field. I currently work as a senior architect at Rackspace and was recently recognized as a Snowflake Data Superhero, one of 72 members worldwide.

Google Cloud: An Overview

Google Cloud is Google's cloud service offering. It provides opportunities for advanced analytics and future development, and it fosters an innovation mindset. For cloud adoption, it offers advantages such as reliability, scalability, and manageability.

Data-Intensive Application Pillars

Data-intensive application design rests on four pillars. The first, the foundation of data systems, covers three properties:

  1. Reliable: The application keeps working correctly even when faults occur.
  2. Scalable: The application copes with growth in data volume and load.
  3. Maintainable: The effort and cost, low or high, of operating and changing the application over time.

The second pillar is Data Models and Query Languages, which determine how data is stored, retrieved, and queried, for both transactional and analytical data. The third pillar, Distributed Data, covers data replication, partitioning, and the type of transactions run on top of your data platform. The fourth pillar, Derived Data, covers how data is processed and the nature of the data in your platform.
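The replication and partitioning mentioned under Distributed Data can be illustrated with a minimal sketch. It does not describe how any GCP service places data internally. The function names, node names, and the hash-modulo placement scheme are all assumptions chosen for illustration.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replica_nodes(partition: int, nodes: list[str], replication_factor: int) -> list[str]:
    """Place a partition on `replication_factor` consecutive nodes (hypothetical policy)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds node count")
    return [nodes[(partition + i) % len(nodes)] for i in range(replication_factor)]


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    for key in ["customer-1", "customer-2", "customer-3"]:
        p = partition_for(key, num_partitions=6)
        print(key, "-> partition", p, "on", replica_nodes(p, nodes, replication_factor=2))
```

Partitioning spreads keys across the platform, while replication keeps more than one copy of each partition so that a single node failure does not make the data unavailable.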

How to Approach Application Design with GCP?

GCP supports different approaches to application design: Extract-Transform-Load (ETL), Extract-Load-Transform (ELT), and a hybrid approach using both. GCP is rich in data services that support relational databases, transactional operations, NoSQL, and analytical operations. Its services fall into several categories: compute, storage, big data, and machine learning/AutoML.
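The difference between ETL and ELT is mainly where the transformation runs. The sketch below is a local illustration only: it uses Python's built-in sqlite3 as a stand-in for a warehouse such as BigQuery, and the table names, sample rows, and function names are assumptions, not part of the talk.

```python
import sqlite3

# Hypothetical raw input: untrimmed names, mixed case, amounts as text.
RAW_ORDERS = [
    ("o-1", " alice ", "19.99"),
    ("o-2", "BOB", "5.00"),
    ("o-3", "alice", "0.01"),
]


def run_etl(conn, rows):
    """ETL: clean rows in application code, then load only the cleaned result."""
    cleaned = [(oid, cust.strip().lower(), float(amt)) for oid, cust, amt in rows]
    conn.execute("CREATE TABLE orders_etl (order_id TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn, rows):
    """ELT: load raw rows untouched, then transform with SQL inside the store."""
    conn.execute("CREATE TABLE orders_raw (order_id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT order_id, lower(trim(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM orders_raw"
    )


def totals(conn, table):
    """Total amount per customer; `table` is a trusted name in this sketch."""
    query = (
        f"SELECT customer, ROUND(SUM(amount), 2) FROM {table} "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn, RAW_ORDERS)
    run_elt(conn, RAW_ORDERS)
    print(totals(conn, "orders_etl"))
    print(totals(conn, "orders_elt"))
```

Both paths end with the same cleaned table. ELT additionally keeps the raw table in the store, which is what allows transformations to be rerun or changed later without re-extracting the source data.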

Implementing Patterns with GCP

Using GCP, you can implement different patterns, such as the ETL pattern through Dataflow and Cloud Data Fusion, and the ELT pattern through BigQuery and Dataproc. You also have multiple options for analytical patterns, including predictive analytics, descriptive analytics, automated model training cycles, and AutoML services.

Designing a Data Platform with GCP

You can implement different platform designs using GCP services. These include:

  • Data Warehouse: This involves designing a data warehouse using GCP's BigQuery, which can be integrated with machine learning services such as AutoML and Vertex AI.
  • Data Lake: In contrast to a data warehouse, a data lake involves storing and maintaining your data at the cloud storage level. Cloud Storage forms the core of the data lake in this design.
  • Data Mesh: This involves decentralizing data at a domain level. Each domain would have its own processing and transformation logic.
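For the data lake and data mesh designs, one common convention is to encode the zone, owning domain, dataset, and date in the object path. The sketch below assumes such a layout. The bucket name, zone names, and path scheme are hypothetical and were not specified in the talk.

```python
from datetime import date

ZONES = {"raw", "curated"}  # assumed zone names for illustration


def lake_object_path(bucket: str, zone: str, domain: str, dataset: str,
                     day: date, filename: str) -> str:
    """Build a Cloud Storage URI with a date partition under a domain prefix."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return f"gs://{bucket}/{zone}/{domain}/{dataset}/dt={day.isoformat()}/{filename}"


if __name__ == "__main__":
    # "example-lake" is a placeholder bucket name.
    print(lake_object_path("example-lake", "raw", "sales", "orders",
                           date(2024, 1, 15), "part-0000.parquet"))
```

Keeping the domain in the path gives each domain team a prefix it owns, which fits the decentralized ownership that the data mesh design describes.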

Data Governance with GCP

Data governance is crucial to any application design. GCP offers a service called Data Catalog for cataloging data at both the technical and the business level, while Cloud Data Fusion helps in implementing data governance.
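To make the idea of a technical and business-level catalog concrete, here is a minimal in-memory sketch. It does not use or imitate the Data Catalog API. The `CatalogEntry` class, its fields, and the sample entries are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str                      # fully qualified table name
    columns: dict[str, str]        # technical metadata: column -> type
    owner: str                     # business metadata: accountable team
    description: str               # business metadata: plain-language meaning
    tags: set[str] = field(default_factory=set)  # governance labels, e.g. "pii"


def find_by_tag(entries: list[CatalogEntry], tag: str) -> list[str]:
    """Return the names of all entries carrying the given governance tag."""
    return [entry.name for entry in entries if tag in entry.tags]


if __name__ == "__main__":
    entries = [
        CatalogEntry("sales.customers", {"id": "STRING", "email": "STRING"},
                     "sales-team", "One row per customer", {"pii"}),
        CatalogEntry("sales.orders", {"order_id": "STRING", "amount": "NUMERIC"},
                     "sales-team", "One row per order"),
    ]
    print(find_by_tag(entries, "pii"))
```

Tagging entries this way is what lets governance rules, such as restricting access to personal data, be applied by label rather than table by table.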

Conclusion

Building data-intensive applications using Google Cloud Platform involves understanding and applying application design principles and using the breadth of services available on GCP. Whether you're extracting and transforming data, performing predictive analytics or designing a data platform, you'll find that GCP offers reliable, scalable, and efficient services to meet your needs.

