NumaFlow
Empowering real-time data processing with simplicity
Sole designer on the Intuit DevX Design Team during a summer internship, leading the redesign of NumaFlow.
Challenge
Intuit's engineering team struggled to efficiently observe and debug large-scale, real-time data processing pipelines, often spending excessive time sifting through log files to identify issues.
Opportunities
To create a powerful, user-friendly interface that would dramatically improve engineers' workflow and productivity in managing complex data pipelines
Team
Intuit DevX Team,
Design Manager
Natasha Girotra
Timeline
May 2022 - Jul 2022
Responsibilities
UX Research
User Testing
Web Data Design
Data Visualization
Client
Original Version
💡
Numaflow 1.0 - Info scattered across clicks
On the right is the original interface, with a pipeline panel displaying the dependencies between vertices for data processing. Clicking a vertex makes the [vertex info] panel pop up. Clicking the [Pods] button opens the [Pods View] page.
New Design
✅
Numaflow 2.0 - All functions in one page
The final design for the NumaFlow project was implemented with a new filter by time, a clearer display panel of Node relationships, easier access to CPU and Memory occupancy, and a cleaner hierarchy under Vertex information.
Overview
NumaFlow is an open-source tool for real-time data processing and debugging in Kubernetes environments. It handles critical data processing needs including:
Stream processing
Batch processing for large datasets
Aimed at developers with basic Kubernetes knowledge, it simplifies identifying data bottlenecks and streamlines log analysis.
The initial design problems
I interviewed 12 internal Intuit users of NumaFlow (including front-end engineers, back-end engineers, data scientists, and product managers) to understand the scope of the project, and identified several critical usability issues:
📊
Pod info is scattered, and CPU/memory usage metrics are hard to find at first glance
🔄
Getting from the initial interface to the details of a specific pod requires multiple clicks
🔍
Anomalous logs are hard to spot in the Pod Logs, since all information is displayed without hierarchy
Competitors overview
We reviewed similar products in the Kubernetes space, including ArgoCD and Purser, and compared their features against the following criteria:
Representation of condensed information for easier access
Display the hierarchy of complicated K8S elements
Filter and access required information quickly
Display information efficiently in a side pop-up
Anomaly score indicators visible at first sight
Purser (VMWare)
Clear selectable metrics to filter through
CPU and memory information displayed clearly
Align with PMs
Simplicity
Enhancing the UI while maintaining simplicity for power users.
Due to limited time and engineering bandwidth, the general design pattern should stay unchanged so the work fits the 4-week implementation timeline.
Priorities
🚪
Difficult pod view navigation
😫
Complicated log following process
🌐
Improve data pipeline display
📉
Poor hardware metrics readability
Design #1:
Info Architecture
🔀 Pipeline → 🔸 Vertex → 🗂️ Search/Browse Pods → 📝 Search Logs in Pods
This clearer structure is the leading principle for the redesign of NumaFlow
🔀
Pipeline
A pipeline is made up of vertices. Conceptually, it is the structural abstraction of a stream- or batch-processing data flow.
🔸
Vertex
A pipeline contains many vertices. Each vertex represents a node in the data streaming process.
🗂️
Pods
Each vertex can have one or more Pods. Pod information can be searched to cross-reference or investigate on other K8s platforms.
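To make the hierarchy concrete, here is a minimal Python sketch of the Pipeline → Vertex → Pods → Logs structure. It is an illustration only: the class names, the `find_pods` helper, and the sample data are hypothetical and are not taken from the NumaFlow codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    logs: list[str] = field(default_factory=list)


@dataclass
class Vertex:
    name: str
    pods: list[Pod] = field(default_factory=list)


@dataclass
class Pipeline:
    name: str
    vertices: list[Vertex] = field(default_factory=list)


def find_pods(pipeline: Pipeline, vertex_name: str, query: str = "") -> list[Pod]:
    """Return the pods under one vertex whose name contains `query`."""
    for vertex in pipeline.vertices:
        if vertex.name == vertex_name:
            return [pod for pod in vertex.pods if query in pod.name]
    return []


# Hypothetical example data, not taken from a real cluster.
pipeline = Pipeline(
    name="simple-pipeline",
    vertices=[
        Vertex("in", [Pod("in-0")]),
        Vertex("cat", [Pod("cat-0"), Pod("cat-1")]),
    ],
)
print([pod.name for pod in find_pods(pipeline, "cat")])
```

Each level is only reachable through its parent, which mirrors the click path in the redesign: pick a vertex first, then search or browse its pods, then read logs inside a pod.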
Problem: 😕 Unclear pipeline-pod relationship
Solution: ✨ Vertex-triggered information display
Version 1️⃣: Separate Button Design
🔘 Pipeline details in separate tab
📊 Pods view as parallel option
❌ Unclear hierarchy between pipeline & pods
🤔 May cause user confusion
👆 Requires extra clicks
Improved Design
Version 2️⃣: Integrated Click Design
🎯 Click vertex → see all info directly
📦 Pods view & specs combined
✅ Clear information hierarchy
🔄 Natural flow of information
🚀 More efficient navigation
Problem: 😕 Cluttered, unscalable pod navigation
Solution: ✨ Color-coded hexagon interface
Version 1️⃣: Button Navigation
📑 Pods listed as tabs
⚠️ Gets cluttered with many pods
🤔 Confusing information structure
📉 Poor scalability for large numbers of Pods
Improved Design
Version 2️⃣: Hexagon Visualization
🔷 Hexagonal pod representation
🎨 Color-coded status (red = urgent)
🔍 Easy visual scanning
📈 Scales well with many pods
🚦 Instant status recognition
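As a rough illustration of the color coding, the sketch below maps a pod's resource usage to a hexagon color. Only "red = urgent" comes from the design. The yellow and green states, the thresholds, and the assumption that status is derived from CPU and memory usage are hypothetical.

```python
def status_color(cpu_percent: float, mem_percent: float,
                 warn_at: float = 70.0, urgent_at: float = 90.0) -> str:
    """Map a pod's resource usage to a hexagon color.

    The thresholds are illustrative assumptions, not values from the product.
    """
    worst = max(cpu_percent, mem_percent)
    if worst >= urgent_at:
        return "red"      # urgent: needs attention first
    if worst >= warn_at:
        return "yellow"   # elevated but not critical
    return "green"        # healthy


# Hypothetical pods with (CPU %, memory %) readings.
pods = {"cat-0": (35.0, 40.0), "cat-1": (72.0, 55.0), "cat-2": (20.0, 96.0)}
colors = {name: status_color(cpu, mem) for name, (cpu, mem) in pods.items()}
print(colors)
```

Using the worse of the two readings means a single color can flag a pod that is short on either resource, which is what lets a large grid of hexagons be scanned at a glance.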
Design #4:
Resource Display
Problem: 😕 Hard-to-read nested resource data
Solution: ✨ Clear, interactive number display
Version 1️⃣: Grid blocks in card
🎨 Grid-style colored blocks
👀 Hidden numbers in boxes
🗃️ Nested card complexity
Improved Design
Version 2️⃣: Number display in colors
📈 Prominent colored numbers
🔄 Interactive hexagon system
↔️ Separated CPU/MEM stats
Design #5:
Log Search
Problem: 😕 Time-consuming log search experience
Solution: ✨ Smart filter bar with instant results
Version 1️⃣: Color Coding
🌈 Terminal-style color indicators
👀 Hard to spot specific info
📜 Endless scrolling required
Improved Design
Version 2️⃣: Filter System
🎯 One-click search function
⚡ Quick filtering options
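The filter idea can be sketched in a few lines of Python. This is a simplified illustration, not the shipped implementation: the `filter_logs` function, the log line format, and the level names are all assumptions.

```python
from typing import Optional


def filter_logs(lines: list[str], keyword: str = "",
                level: Optional[str] = None) -> list[str]:
    """Keep lines containing `keyword` (case-insensitive) and, if `level`
    is given, lines that include that level as a separate word."""
    keyword = keyword.lower()
    matches = []
    for line in lines:
        if level is not None and level.upper() not in line.split():
            continue
        if keyword not in line.lower():
            continue
        matches.append(line)
    return matches


# Hypothetical log lines in an assumed "timestamp LEVEL message" format.
sample = [
    "2022-06-01T10:00:00Z INFO processed 120 messages",
    "2022-06-01T10:00:01Z ERROR failed to write to sink",
    "2022-06-01T10:00:02Z WARN retrying sink write",
]
print(filter_logs(sample, level="error"))
print(filter_logs(sample, keyword="sink"))
```

Combining a level filter with a keyword search is what replaces the endless scrolling of the color-coded version: the engineer narrows the list instead of scanning it.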
Design #6:
Process Rate Visibility
Problem: 😕 Cumbersome access to processing rates
Solution: ✨ Instant hover-triggered data view
Version 1️⃣: Global Filter
🎚️ Top-level filtering menu
🖱️ Extra clicks needed
📊 Limited data visibility
Improved Design
Version 2️⃣: Hover Display
🖱️ Simple mouse hover
📈 Instant rate display
🕒 All timestamps visible
Product Impact
Metrics
🔥 33% of critical outages come from changes
🏗️ 2500+ K8s services running
Impact in SaaS matrix
Numaflow works hand-in-hand with ML to catch and fix problems
Enables smart streaming (NumaFlow) + ML detection (NumaLogic) with the AIOps platform
Released at KubeCon 2022
Intuit shared how we built a K8s-native, DAG-based stream processing platform (Numaflow) and a streaming ML platform (Numalogic) to solve outages using AIOps
Reflections
Cross-field collaboration
Think boldly, act quickly, and connect more by initiating new sessions and learning together
Multitasking is important: handle several projects in priority order
Fast learning and iterating
Wearing the Engineer’s hat: embrace the unknown and constantly research and ask
Constant design-feedback loop and communication
Product is the key, always user first