Getting Started with Ray on Google Cloud Platform
As AI and machine learning workloads continue to grow in scale and complexity, the need for flexible and efficient distributed computing frameworks becomes increasingly important. Ray is an open-source framework built to simplify the development and execution of distributed applications using familiar Python syntax. This post shows how to get started with Ray on Google Cloud Platform, covering the fundamentals of Ray’s distributed architecture, core components, and scaling strategies. You’ll learn how to deploy and manage Ray clusters on Vertex AI, configure autoscaling, and run distributed Python and machine learning workloads with practical code examples.
Monday, March 31, 2025
20 min read
Building Trustworthy RAG Systems with In Text Citations
Retrieval-Augmented Generation (RAG) has revolutionized how we build question-answering and content creation systems. By combining the power of large language models (LLMs) with external knowledge retrieval, RAG systems can generate more accurate, informative, and up-to-date responses. However, a critical aspect often overlooked is trustworthiness. This is where citations come in. Without citations, a RAG system is a "black box." This post will explain the importance of citations in RAG systems and provide some implementations using Google's Generative AI SDK, LangChain, and LlamaIndex, with detailed code walkthroughs.
Monday, March 24, 2025
17 min read
FullStack AI Series - Intro to System Design for Data Scientists and ML Engineers
System design is the process of laying out a system's structure, components, modules, interfaces, and data to meet specified requirements. For machine learning engineers and data scientists, comprehending a system's life cycle provides a blueprint for building, deploying, and maintaining ML/AI solutions in production. This post will introduce and discuss some of the more critical stages of the system design process (including requirements analysis, architecture, development, deployment, and scaling). It will also introduce some technologies and tools that can be used to design, develop, and deploy systems, such as Docker, Docker Compose, Docker Swarm, and Kubernetes.
Sunday, March 17, 2024
35 min read
Python for Data Science Series - Exploring the syntax
In the last post, we discussed the importance of programming in the data science context and why Python is considered one of the top languages used by data scientists. In this week's post, we will explore the syntax of Python and create a simple program that uses the Google Cloud Vision API to detect faces in an image.
Tuesday, October 25, 2022
20 min read
Python for Data Science Series - Getting started
Thinking about jumping into a data science role, but you don't know why you should learn how to program or which programming language to choose? In this post, I will show you how to use Python and discuss why it is considered one of the most widely used languages in data science.
Friday, August 26, 2022
7 min read