Author profile

Henry Bui

Engineering, CloudThinker

Henry Bui leads the engineering team at CloudThinker, building and continuously improving the agentic system end-to-end.

He writes about agent memory, runtime performance, generative UI, and incident response — the parts of the platform that make AI agents fast and useful in production.

Posts by Henry (8)

Product · 2026-05-20
Introducing Deep Response Engine
Most platforms tell you something is wrong. CloudThinker tells you why — and starts fixing it before you open your laptop. Pulse clusters the noise, Incident investigates in parallel, Memory makes the next one faster.
How To · 2026-04-15
Eager Tool Calling: How We Cut Agent Latency by 50% on Long Tool Chains
An 8-tool agent task took 24 seconds. The model was fast. The tools were fast. The wall clock was slow. We rewrote the stream handler to fire each tool the moment its block finishes streaming — not at message_stop — and cut median end-to-end agent latency by 50% across production traffic, with longer tool chains pulling further ahead.
Product · 2026-04-06
Generative UI in Production: Lessons from a Pydantic-to-DSL Migration
A year ago we shipped agent dashboards built on strict Pydantic schemas — typed JSON tool calls, design-system-consistent, secure. They took 30-40 seconds and cost roughly $0.50 per report. Today they stream in under 10 seconds for $0.08, on a line-oriented DSL we built on top of OpenUI Lang. The story of two architectures, two detours we deliberately skipped, and what constrained-decoding JSON taught us about the limits of structured output.
How To · 2026-03-24
Choose the Right Model, Optimize Your AI Costs
Your AI agent just spent 1.7x credits on a simple status check. Meanwhile, a complex root cause analysis failed on the cheapest model. CloudThinker's three-tier system — Light (0.3x), Pro (1.0x), Ultra (1.7x) — lets you match intelligence to complexity. Build Skills with Ultra, run them on Light, and save 40%+ without losing quality.
Product · 2026-03-18
Agent Memory Meets Graph: Introducing MemGraph — Long-Term Memory for AI Cloud Agents
Your AI agent brilliantly diagnosed a connection storm last Tuesday. On Wednesday, the exact same pattern appeared — and the agent started from zero. This is the story of MemGraph: a knowledge graph memory system that lets AI agents remember, connect, and evolve operational knowledge across every conversation.
Product · 2026-02-04
Introducing CloudThinker Incidents
Agentic incident management that thinks like your best engineer. From 45-minute investigations to under 10 minutes with agentic root cause analysis, topology-aware blast radius, and continuous learning.
Product · 2025-11-26
From Weeks to Hours: Building an AI-Powered Cloud Assessment Engine
Learn how we transformed the expensive, weeks-long AWS Well-Architected Review into a 10-minute automated workflow by leveraging specialized AI agents and matrix-based parallelization to deliver actionable insights at cloud scale.
Product · 2025-11-24
CloudThinker Agentic Orchestration and Context Optimization
A technical deep dive into building scalable multi-agent systems using the Supervisor pattern and advanced context optimization.
