First-principles systems builder

TIAGO OLIVEIRA

30 years of systems thinking // from diesel engines to distributed systems

Bridging domains others avoid: mechanical to software, operations to computer vision, telephony to AI. Same systems thinking, different tools!

The Journey

From rebuilding diesel engines in rural Brazil to architecting AI platforms serving tens of millions—this is a story of systems thinking.


The Shop Floor 1994–2013

I started working at age 8 in my father's truck mechanic shop in Xanxerê, a small city in southern Brazil. For 19 years, I worked on diesel engines, hydraulic systems, and pneumatic equipment.

This wasn't hobby work. It was full-time mechanical engineering in an environment where diagnostic manuals didn't exist and parts had to be fabricated rather than ordered.

Diagnosing why a diesel engine fails under load requires systematic elimination of variables—fuel delivery, compression, timing, electrical systems. This same mental model later applied to debugging distributed systems at scale.

As automation became more common in heavy machinery, I began working with PLCs, PIC microcontrollers, and early Arduino boards. I built digital controllers interfacing with mechanical, hydraulic, and pneumatic systems for wood processing equipment, tube bending machines, and food manufacturing.

The critical insight: Edge computing and IoT came naturally later because I was already solving hardware-software integration at industrial scale. The jump to cloud infrastructure wasn't abandoning mechanical work—it was taking the same systems thinking from factory floors to distributed platforms.


The Forced Upgrade 2004–2007

My mother recognized that physical labor, while honorable, limited long-term opportunity. She insisted I enroll in a three-year software development bootcamp while continuing to work in the shop.

This wasn't a gentle suggestion. It was a forced update—Mom.exe pushing a mandatory patch.

The bootcamp taught Java, SQL, and web development basics. The value wasn't the specific technologies—it was learning that software systems could be decomposed and debugged using the same systematic thinking I'd developed with diesel engines.

I earned a Bachelor of Technology in Information Technology and a Post-Graduate Specialization in Software Development with Java from Universidade do Oeste de Santa Catarina.


The Bridge Years 2013–2017

My first professional software role at NewFocus involved customizing ERP systems for industrial clients. This was the perfect bridge between my mechanical background and software development—solving the "last mile" problem of connecting physical operations to digital systems.

A wood processing company needed to track trucks leaving their premises and capture load weights automatically. I built custom ERP modules integrating PDA devices, industrial weighing scales, and real-time data pipelines. I wasn't just writing database queries—I was solving how to make factory floor operations visible to business systems.

At Nokia Siemens Networks, I built a personnel safety system for cell tower technicians using cellular triangulation before GPS was ubiquitous in mobile devices. If a technician didn't physically move within a specified time window, the system escalated automatically. This was real distributed systems work—sensor data, time-series analysis, event-driven alerting.
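The escalation rule described above can be sketched roughly as follows. This is a minimal illustration, not the original system: the `Fix` record, the 150 m movement threshold, and the 15-minute window are all assumptions chosen for the example, and positions are taken as already-triangulated local coordinates.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Fix:
    """One position estimate for a technician (hypothetical record shape)."""
    timestamp_s: float  # seconds since the start of the shift
    x_m: float          # local east coordinate, metres
    y_m: float          # local north coordinate, metres


def seconds_without_movement(fixes: list[Fix], min_move_m: float) -> float:
    """Return how long the technician has stayed within min_move_m of an anchor.

    The anchor resets whenever a fix lands farther than min_move_m from it,
    so position noise smaller than the threshold does not count as movement.
    """
    if not fixes:
        return 0.0
    anchor = fixes[0]
    for fix in fixes[1:]:
        if hypot(fix.x_m - anchor.x_m, fix.y_m - anchor.y_m) > min_move_m:
            anchor = fix  # real movement: restart the stillness clock here
    return fixes[-1].timestamp_s - anchor.timestamp_s


def should_escalate(fixes: list[Fix],
                    min_move_m: float = 150.0,   # assumed noise threshold
                    window_s: float = 900.0) -> bool:  # assumed 15-minute window
    """True when the technician has not moved for at least window_s seconds."""
    return seconds_without_movement(fixes, min_move_m) >= window_s


# Example track: small jitter, one real move at t=600, then stillness.
TRACK = [
    Fix(0, 0, 0),
    Fix(300, 20, 10),
    Fix(600, 400, 0),
    Fix(900, 410, 5),
    Fix(1600, 405, 0),
]
print(should_escalate(TRACK))
```

The threshold matters because triangulated positions are coarse: without it, measurement noise would look like movement and the alert would never fire.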

At Dell Technologies, I pioneered server-side JavaScript rendering in 2013—years before React SSR became standard practice.

At Zenvia Mobile, I dockerized applications when Docker was pre-1.0. The goal was handling unpredictable SMS traffic spikes—political campaigns, breaking news, marketing blasts. I built L1/L2/L3 escalation procedures and a "FireFighter" on-duty rotation system that's still operational at Zenvia today, a decade later.

At AGCO, I built military-grade IoT security for agricultural machinery: sub-millisecond authorization using custom nonce calculation on mutual TLS with Erlang and RabbitMQ. I also orchestrated autonomous machine-to-machine coordination for harvesters and grain carts using MQTT-based handshakes with sub-meter GPS accuracy.


The International Protocol 2017–2020

In 2017, I moved to Berlin.

At PayU, I consolidated payment reconciliation for 14 markets, normalizing disparate formats from banks, merchants, and acquirers into a single serverless platform. 60% cost reduction.

At OSRAM, I architected zero-trust IoT security with OAuth 2.0 extensions and HSM-backed cryptography.

Then to Stuttgart at Mercedes-Benz.io. Multiple departments needed different views of vehicle data across the lifecycle. The legacy system required manual view creation and individual integrations per department.

My architectural insight: event sourcing. Unified vehicle state across all lifecycle stages, allowing any department to materialize their own view from the same event stream.

I built the platform using AWS Lambda, containers, S3, DynamoDB, EventBridge, and ElastiCache. It connected 60+ global systems, cut costs by 47%, and took deployment velocity from months to days.

The principle: One source of truth, infinite flexible views, no tight coupling.
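As a rough sketch of that principle, each department can fold the same event stream into its own view. Everything named here (`VehicleEvent`, the event kinds, the two projectors) is hypothetical and chosen for illustration; it is not the Mercedes-Benz.io schema or implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class VehicleEvent:
    """An immutable fact about one vehicle (hypothetical shape)."""
    vin: str
    kind: str
    data: dict[str, Any]


View = dict[str, dict[str, Any]]
Projector = Callable[[View, VehicleEvent], None]


def materialize(events: list[VehicleEvent], projector: Projector) -> View:
    """Replay the shared event stream through one department's projector."""
    view: View = {}
    for event in events:
        projector(view, event)
    return view


def logistics_projector(view: View, event: VehicleEvent) -> None:
    """Logistics only cares about where a vehicle is in its lifecycle."""
    if event.kind in {"produced", "shipped", "delivered"}:
        view.setdefault(event.vin, {})["stage"] = event.kind


def sales_projector(view: View, event: VehicleEvent) -> None:
    """Sales only cares about price and whether the vehicle was handed over."""
    if event.kind == "priced":
        view.setdefault(event.vin, {})["price_eur"] = event.data["price_eur"]
    elif event.kind == "delivered":
        view.setdefault(event.vin, {})["sold"] = True


# One stream, written once; the events are made-up sample data.
EVENTS = [
    VehicleEvent("VIN1", "produced", {"plant": "A"}),
    VehicleEvent("VIN1", "priced", {"price_eur": 50000}),
    VehicleEvent("VIN1", "shipped", {}),
    VehicleEvent("VIN2", "produced", {"plant": "B"}),
    VehicleEvent("VIN1", "delivered", {}),
]

print(materialize(EVENTS, logistics_projector))
print(materialize(EVENTS, sales_projector))
```

Adding a new department means writing one more projector and replaying the stream; neither the event producers nor the existing views change.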

Each move required rebuilding credibility: learning new languages, adapting to different engineering cultures, navigating compliance requirements. The gap between internal certainty and external validation creates energy but also stress.


The Cloud Native Era 2020–Present

I joined AWS in May 2020 as Senior Solutions Architect in Stuttgart, implementing Industry 4.0 initiatives with Germany's premier manufacturing companies, including BMW, Bosch, Siemens, and Festo.

Manufacturing systems have strict reliability requirements. A production line stopping costs hundreds of thousands per hour. My background in mechanical systems gave me an intuitive understanding of physical constraints that pure software engineers often miss: vibration, temperature, electromagnetic interference, and network reliability in industrial environments.

In October 2021, I moved to Austin as Senior Product Architect at AWS Industry Products, working on computer vision platforms. I created the CVOps framework, which extends MLOps principles to cover the entire computer vision lifecycle, built the first implementation to demonstrate it, and led platform development. I focused on the embedded stack, creating a Rust-based system that brings cloud-style flexibility to resource-constrained devices.

In October 2024, I became Principal Architect focused on telecommunications and generative AI, moving to Seattle.

Real-Time AI Telephony

My current work focuses on real-time AI-powered voice platforms: enabling generative AI interactions over traditional phone systems for major telecommunications carriers.

The technical challenge: Building systems that bridge 1990s-era telephony protocols (SIP/RTP) with modern AI inference platforms, maintaining carrier-grade reliability and real-time performance.

This is genuinely uncharted territory. When new AI capabilities emerge, I build working prototypes within hours to validate architectural approaches. These prototypes become the foundation for production systems spanning multiple regions, handling sub-100ms latency requirements at massive scale.

The work includes invention disclosures for privacy-preserving AI quality evaluation and agent-to-agent handover.


The Through-Line

From 1994 to today, from a truck mechanic's shop in rural Brazil to architecting AI systems at global scale, the through-line is consistent:

Systems thinking applied to ambiguous problems with real-world constraints.

Whether diagnosing why a diesel engine fails under load or why a distributed system fails under load, the mental model is the same:

  1. Decompose to fundamental components
  2. Identify where constraints bite
  3. Build a prototype to test assumptions
  4. Iterate based on reality, not theory
  5. Make it operationally excellent before scaling

The tools changed (wrenches to keyboards, diesel engines to distributed systems, mechanical shops to cloud infrastructure), but the approach remained constant!

How I Think

"The gap between demo and production is where I live."

Most architects can design for the happy path. The value is in knowing what breaks at 3am under 10x load with a team that's never seen the code.


First Principles from the Shop Floor

Every distributed system is just another machine with predictable failure modes you can debug and prevent. The more complex a system gets, the harder its failures are to predict, but it is never impossible!

I spent 19 years diagnosing mechanical failures without manuals. When a diesel engine fails under load, you don't guess. You systematically eliminate variables: fuel delivery, compression, timing, electrical systems. You decompose the problem until you find the constraint.

Software systems work the same way. The abstractions are different, but the physics are the same: latency is limited by the speed of light, compute requires energy, energy generates heat, heat requires cooling. Every system has constraints. Find them.


Core Beliefs

Constraint Thinking

Every problem has one constraint that matters most. Find it. Everything else is noise until that constraint is addressed.

In manufacturing, it's usually throughput at a specific station. In distributed systems, it's usually the slowest component in the critical path. In organizations, it's usually the decision that's blocked or the person who's overloaded.
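For the distributed-systems case, one way to make "the slowest component in the critical path" concrete is sketched below. The component names and latencies are invented, and the model is deliberately simple: it assumes an acyclic dependency graph with one fixed latency per component.

```python
from functools import lru_cache


def critical_path(target: str,
                  latency_ms: dict[str, float],
                  depends_on: dict[str, list[str]]) -> list[str]:
    """Return the dependency chain ending at target with the largest total latency.

    Assumes the dependency graph is acyclic.
    """
    @lru_cache(maxsize=None)
    def best(node: str) -> tuple[float, tuple[str, ...]]:
        # A node can only finish after its slowest upstream chain has finished.
        upstream = [best(dep) for dep in depends_on.get(node, [])]
        cost, path = max(upstream, default=(0.0, ()))
        return cost + latency_ms[node], path + (node,)

    return list(best(target)[1])


def constraint(target: str,
               latency_ms: dict[str, float],
               depends_on: dict[str, list[str]]) -> str:
    """Return the slowest component that actually sits on the critical path."""
    path = critical_path(target, latency_ms, depends_on)
    return max(path, key=lambda name: latency_ms[name])


# Hypothetical request graph; "audit_log" is slow but runs off the request path.
LATENCY_MS = {"gateway": 5, "auth": 20, "inventory": 120,
              "pricing": 40, "render": 15, "audit_log": 300}
DEPENDS_ON = {"auth": ["gateway"], "inventory": ["auth"],
              "pricing": ["auth"], "render": ["inventory", "pricing"],
              "audit_log": ["auth"]}

print(critical_path("render", LATENCY_MS, DEPENDS_ON))
print(constraint("render", LATENCY_MS, DEPENDS_ON))
```

In this made-up graph the slowest component overall is `audit_log`, but the response never waits on it, so the constraint worth addressing is `inventory`. Speeding up anything off the critical path is the noise the paragraph above describes.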

Tradeoff Clarity

Frame decisions so stakeholders can choose. Don't hide complexity. Expose it clearly enough that the right people can make informed tradeoffs.

"We can have consistency or availability, not both during a partition" is useful. "It's complicated" is not.

This applies to technical and organizational decisions equally. When a team is stuck, it's often because the tradeoffs aren't visible to the people who need to make them.

Teaching as Liberation

I love teaching. But not the kind that produces copies of the teacher.

Following Paulo Freire's thinking, I see education as a tool for liberation, not indoctrination. The goal isn't to make people think like me. It's to help them think for themselves. Questioning, dialogue, co-discovery. The best outcome is when someone reaches a conclusion I wouldn't have, and it's better than mine.

Individual contribution doesn't scale. What scales is enabling others to solve problems you'll never see. The best architectural decisions are the ones teams can extend without you. The best debugging sessions are the ones where someone else finds the root cause because they learned how to look.

Design for Evolution

Today's architecture is tomorrow's legacy. Build systems that can be replaced piece by piece, not rewritten wholesale.

Event sourcing at Mercedes-Benz wasn't just about current requirements. It was about enabling views we couldn't predict yet. One source of truth, infinite flexible views.

Two-Way Door Decisions

I'm a fervent advocate for reversibility.

One-way doors are decisions that are costly or impossible to undo. They deserve caution, analysis, and buy-in. Two-way doors are decisions you can reverse if wrong. They deserve speed and experimentation.

Most decisions are two-way doors mistaken for one-way doors. Teams slow down unnecessarily, seeking consensus for choices that could simply be tried and reverted. Recognizing which type you're facing changes everything about how you should move.


The Mechanical Foundation

I maintain a full workshop in my garage for building metal pieces. This isn't nostalgia. It's philosophy.

Pure abstraction without physical reality feels incomplete. The best software systems account for real-world constraints that pure software engineers often miss:

  • Vibration affects sensors and connections
  • Temperature changes component behavior
  • Electromagnetic interference corrupts signals
  • Network reliability varies by environment
  • Power availability isn't guaranteed

When you've rebuilt an engine in a shop where the nearest replacement part is 500km away, you develop a different relationship with operational excellence. You build systems that can be diagnosed and repaired, not just deployed and replaced.

One lesson that stuck: never assemble without proof you're going in the right direction. I've mounted an engine back into the chassis only to discover I needed to pull it again for one oil retainer I missed. That teaches you something about validation. You learn to verify before you commit. Check the next layer before closing up the current one. In software, this translates directly: don't merge without confidence, don't deploy without verification, don't architect yourself into a corner you can't back out of.


How I Work

Hands-On Leadership

I don't design systems in ivory towers. For my current telephony work, I didn't just draw architecture diagrams. I built WebSocket servers, tuned GStreamer pipelines, debugged SIP flows, and solved jitter buffer timing issues.

This keeps architectural decisions grounded in implementation reality. It's also how you earn credibility with engineering teams. People trust your judgment differently when they've seen you debug alongside them.

Prototype-First Validation

Plans are hypotheses. Prototypes are evidence.

When new AI capabilities emerge, I don't write documents about what we could do. I build working prototypes, sometimes within 24 hours. Those prototypes accelerate enterprise decisions and de-risk architectural choices.

The pattern:

  1. Identify the riskiest assumption
  2. Build the smallest thing that tests it
  3. Learn from reality, not theory
  4. Scale what works

Navigating Organizations

Technical problems are often organizational problems in disguise. A system that requires three teams to coordinate for every deployment isn't a technical architecture problem. It's a team boundary problem.

I've learned to read organizational dynamics the same way I read system architecture: where are the bottlenecks, who holds context that others need, what decisions are blocked and why. Sometimes the right technical choice is the one that works with organizational reality rather than against it.

Operational Excellence as Default

I come from environments where system failure had immediate economic consequences. Factory lines stopping. Trucks broken down. I build observability and failure recovery from day one.

Not as an afterthought. Not as a "phase 2." From day one.


What I'm Skeptical Of

Process theater. Meetings about meetings. Documentation that no one reads. Ceremonies that don't produce decisions.

Premature abstraction. Three similar lines of code are better than a premature abstraction. Build for today's requirements, not tomorrow's hypotheticals.

Architecture astronauts. People who design systems they'll never implement or operate. The gap between diagram and deployment is where most architectures fail.

"Best practices" without context. What works for Google doesn't work for a 5-person startup. What works in a microservices architecture doesn't work for a monolith. Context determines correctness.


What I Optimize For

Clarity over cleverness. Readable code over clever code. Explicit over implicit. Boring technology over exciting technology.

Operational simplicity. Can someone debug this at 3am? Can a new team member understand it in a week? Can it fail gracefully?

Speed on two-way doors. If a decision can be reversed, make it fast. Save the deliberation for the ones that can't.

Learning velocity. How fast can we discover what we don't know? Prototypes beat documents. Production beats staging. Customer feedback beats internal review.

Team capability. Am I leaving this team better equipped than I found them? Can they operate and evolve this system without me?


The Through-Line

Whether debugging diesel engines or distributed systems, the approach is the same:

  1. Decompose to fundamental components
  2. Identify where constraints bite
  3. Build a prototype to test assumptions
  4. Iterate based on reality, not theory
  5. Operationalize before scaling

The tools changed. The thinking didn't.

Tiago Oliveira

Principal Engineer & Architect | AI Platforms at Carrier Scale

tiago@tiago.sh · LinkedIn · tiago.sh · Greater Seattle Area


What I Do

I build the systems that turn ambitious technical bets into production platforms. The pattern is consistent: walk into an ambiguous, high-stakes problem, prototype something that changes the conversation, then architect it for scale and operational excellence.

My current work is real-time AI embedded directly into telecommunications networks, serving tens of millions of subscribers with sub-second latency requirements. The platforms I build don't just solve today's problem; they're designed so product teams can ship new AI capabilities independently without platform rewrites.

I spent 19 years rebuilding truck transmissions before taking my first professional software role. That's not a fun fact; it's the foundation. Every distributed system is just another machine with predictable failure modes. Whether diagnosing why a diesel engine fails under load or why a distributed system fails under load, the mental model is the same.

I learn new technologies fast and ship production systems in weeks. I've built at carrier scale across cloud providers, from serverless to Kubernetes to bare metal. The technology stack is the easy part. The hard part is knowing which problem actually needs solving.


Experience

Principal Solutions Architect | Generative AI Platforms

Amazon Web Services (AWS) · October 2024 – Present · Seattle, WA

Architecting real-time AI platforms at carrier scale for major telecommunications operators

→ Built an overnight prototype that displaced an incumbent multi-year engagement. What started as a proof of concept became the foundation for the world's first network-integrated real-time AI platform, announced at the carrier's investor event and now in beta serving millions of subscribers.

→ Drove the strategic pivot from point solution to platform. Fought for and won the architectural framing that positions real-time AI as a network capability, not a standalone feature. This platform approach directly enabled expansion into AI agents, virtual assistants, and dozens of additional agentic workloads. Single-handedly closed a multi-million dollar professional services contract to build the next wave, establishing the foundation for an $800M+ strategic partnership projected to become the largest customer for multiple flagship AWS AI services.

→ Fully embedded in the customer's engineering organization, not treated as a vendor. Perform code reviews, push code, and mentor engineers across all levels. Customer senior leadership made my personal involvement a contractual condition, with directors and senior directors putting their own career reputations behind the project. Hands-on across the full stack: designed dual-channel inference architecture (independent streams per speaker) that outperformed all prior approaches, and pioneered voice ducking techniques that transformed translated calls from robotic turn-taking into natural conversation.

→ Designed comprehensive tenant isolation architecture for multi-tenant AI agent platforms using full-silo isolation on serverless managed services. Published internal research demonstrating why shared infrastructure for AI agents creates structural vulnerabilities (cross-tenant retrieval attacks, memory contamination, credential leakage) that application-level isolation cannot address. Built utterance-level latency observability exported to OpenTelemetry for p50/p95/p99 analysis and a privacy-preserving evaluation framework for translation quality (Go, patent filing in progress).

→ All of this with zero prior telecommunications experience. Learned SIP/RTP, network hairpinning, carrier-grade reliability patterns, and regulatory requirements from first principles, then became the person carrier engineers seek out for guidance.

Senior Product Architect | Computer Vision & AI

Amazon Web Services (AWS) · October 2021 – October 2024 · Austin, TX

Built AI platform reducing investigation time by 35% for major security companies

→ Built video intelligence platform reducing security investigation time 35%. Architected edge-to-cloud continuum (MEC/Outposts for near-edge, TFLite for embedded edge). 40% reduction in false-positive dispatches through intelligent multi-stream correlation and GenAI-powered summarization.

→ Pioneered CVOps framework (Computer Vision Operations) that became the team's foundation. Personally coded initial Python/Rust implementation to prove the pattern, then established CI/CD and shared libraries enabling the team to scale independently. System handled millions of cameras with sub-millisecond inference. 12+ patents filed covering model monitoring, cryptographic video signing, multi-modal scene understanding, and device optimization.

→ Achieved sub-100ms latency at 5,000+ predictions/second. Lambda control plane with auto-scaling Fargate data plane. Architecture handled seamless scaling from POC to production video processing workloads.

Senior Solutions Architect | Industry 4.0 & Edge AI

Amazon Web Services (AWS) · May 2020 – October 2021 · Stuttgart, Germany

Transformed manufacturing with edge-to-cloud AI achieving 99.99% reliability

→ Solved the "cloud latency kills production lines" problem. Manufacturing control systems needed millisecond response times. Architected edge computing framework keeping critical decisions local while maintaining cloud connectivity for analytics. 99.99% reliability, 35% reduction in unplanned downtime.

→ Built IoT fleet management for 10,000+ heterogeneous industrial endpoints. The challenge was enabling modern AI/ML on decades-old legacy machinery that couldn't be replaced. Created abstraction layer bridging legacy equipment with cloud intelligence.

→ Reduced robot safety incidents through simulation, not sensors. Built ROS-based system learning from human patterns and generating collision-free routes automatically. 33% reduction in space invasion, 21% fewer safety stops.

Principal Software Engineer | Platform Architecture

Mercedes-Benz.io · September 2018 – May 2020 · Stuttgart, Germany

Built the technical foundation for Mercedes-Benz's global digital transformation

→ Solved the "every department needs a different view of the vehicle" problem through event sourcing. Architected event-sourcing platform creating unified vehicle state across all lifecycle stages. 60+ systems integrated, 47% cost reduction by eliminating manual integration. Deployment velocity from months to days.

→ Pioneered Hypothesis-Driven Development for ML. Built production-grade MVPs to validate before scaling full pipelines. High-performance pricing engine processing 1,000+ evaluations/second on decades of sales data proved the pattern works under real load.

Senior Staff Software Engineer | FinTech & IoT Security

PayU · OSRAM · May 2017 – September 2018 · Berlin, Germany

→ Consolidated 14 markets' payment reconciliation into single serverless platform. Normalized disparate formats from banks, merchants, and acquirers across markets into a common model. 60% cost reduction from consolidation and elastic scaling. Built ML-based fraud detection with autonomous weekly retraining handling millions of daily transactions.

→ Architected zero-trust IoT security for smart lighting infrastructure. Built custom OAuth 2.0/OpenID Connect extensions for fine-grained device permissions with HSM-backed cryptography.

Senior Software Architect | AgTech IoT

e-Core · February 2016 – May 2017 · Porto Alegre, Brazil

Military-grade IoT platform for agricultural machinery

→ Built near-real-time auth platform without sacrificing security. Implemented custom nonce calculation on mutual TLS using Erlang and RabbitMQ, and contributed a plugin back to the RabbitMQ project. Built air-gapped firmware signing process using YubiKey hardware API for supply chain integrity.

→ Orchestrated autonomous machine-to-machine coordination. Harvesters and grain carts communicating directly for autonomous operation (approach, align, transfer, signal, separate). Built MQTT-based handshake mechanism with sub-meter GPS accuracy.

Early Career Foundation

Incrosolda Serviços e Mecânica · NewFocus · Nokia Siemens · Zenvia Mobile · Dell Technologies · 1994–2016

→ 19 years as truck mechanic starting at age 8, maintaining diesel engines, hydraulics, pneumatic systems, and fabricating parts from scratch when none existed. Built digital controllers interfacing with mechanical, hydraulic, and pneumatic systems for industrial automation.

→ Pioneered early containerization (Docker pre-1.0, 2014) for handling unpredictable traffic spikes serving millions of users. Built server-side JavaScript rendering years before React SSR existed. Created incident management frameworks still in operational use today.

→ Built ERP integrations connecting physical operations to digital systems: PDAs tracking truck departures, weighing systems feeding load data, factory floor operations flowing into business systems.


Education & Recognition

Bachelor of Technology, Information Technology — Universidade do Oeste de Santa Catarina

Post-Graduate Specialization, Software Development with Java — Universidade do Oeste de Santa Catarina

12+ Patents Filed — Computer vision model monitoring, cryptographic video signing, multi-modal scene understanding, device optimization, smart storage

AWS AllStar Award — Customer Obsession

Languages: English (Native-level), Portuguese (Native), Spanish (Professional), German (Conversational), Italian (Basic)

Led hundreds of technical advisors. Promoted 25+ engineers across multiple seniority levels through hands-on mentorship.