Zebras all the way down
The engineering challenges of the data path
Bryan Cantrill
CTO
bryan@joyent.com
@bcantrill
The luxury of statelessness
• In service-oriented software systems, we love statelessness
• And for good reason: stateless components lend systems many
desirable properties!
• Stateless components can be easily made immutable, scalable,
re-deployable, restartable, upgradeable, etc. etc.
• Of course, persistent state still very much exists — we just
use separation of concerns to confine the management of state
to those services that do it explicitly and exclusively…
The data path
• The data path consists of the software, hardware, and firmware
components between a service endpoint that offers persistence
and the implementation of that persistence
• The data path always ends in non-volatile storage, which (for
now, anyway) means either flash or magnetic media
• The data path traverses many subsystems and components —
and nearly always is a distributed system itself
• We place great demands upon the data path…
The demands of the data path
• A data path that merely works much of the time is insufficient
• We (rightfully) expect perfection from the data path: we expect it
to be consistent, available and partition-tolerant!
• Of course, Brewer’s CAP theorem tells us that this isn’t actually
possible — we must make tradeoffs
• Even a well-engineered system can’t beat CAP — but a poorly
engineered one will be flailed by it, becoming pathologically
unavailable or inconsistent
• Zebras are the difference
Zebras?
• In American medical slang, a zebra is a rare and exotic
condition that can be conflated with more common ailments
• Medical students and residents are cautioned against
diagnosing them, to the point of aphorism: “when you hear
hoofbeats, think of horses not zebras”
• But — as anyone who has been afflicted by one will affirm —
zebras emphatically exist!
A zebra close to home
Zebras in the data path?
• Even though the data path runs on and ends with hardware, it
consists of many disjoint and unseen software components
• The paradox of software (especially that of the data path!) is
that software is both information and machine
• When software works correctly, it survives as information does:
namely, in perpetuity
• Especially where software is expensive to write and difficult to
fix, there is an overwhelming bias towards extant software
• Over time, the horses are found; only the zebras are left
Hunting zebra
• We must assume that unusual pathologies — especially in a
distributed system — will not be readily reproducible!
• When we are culturally afflicted with “bias for action”, it
becomes tempting to immediately change the system to fix it
• This is the wrong first motion: the choice between restoring
service and understanding the failure is often a false dichotomy!
• We must not change the system but rather observe it — we
must focus not on snap hypotheses, but rather initial questions
• The observability of the system is paramount!
Observability at Joyent
• Observability is an organizing principle at Joyent — it is a
primary reason that we run SmartOS, our illumos derivative
• Manta — our (open source, container-centric) object storage
service — has SmartOS and ZFS at its core
• Manta uses sharded PostgreSQL for metadata (+ ZooKeeper
for leader election), with services primarily in node.js
• We invested heavily in the observability and debuggability of
node.js — and it is a (the?) reason we still use node.js
Observability at Joyent Samsung!
• Out of desire to build their own cloud based on Manta and Triton
(our open source cloud management system), Samsung bought
Joyent in June 2016
• While Manta has been in production for several years,
Samsung’s level of scale has brought new-found challenges
• Good news: between several years of production + observability
(logging, DTrace, mdb) + hyperscale post-Samsung, we have
nailed many thorny problems in Manta
• Bad news: our stack — and that of every data path — has
components that we still struggle to observe and debug…
Zebra sanctuary
• Unfortunately, the data path is laced with proprietary software
that can’t be observed, audited, verified, or debugged
• This is the software that interacts so directly with the hardware
as to create the illusion of hardware to higher-level software
• This is firmware, and it runs so dark and deep in the data path
that much of it is impossible to see or catalogue
• Firmware that operates silently will also fail implicitly — it is
hardware failing with software’s failure modes
Zebras in the spindle
• Rotating magnetic media is a modern mechanical marvel
• With sealed enclosures and helium-based drives, densities
continue to increase — the disk will be with us for a long time!
• Disks are vulnerable to vibe, temperature, particulates,
aspersions, wear, etc. — magnetic media will fail!
• But the disk knows this, and sophisticated on-head/on-controller
firmware steers around failed media…
• …leaving much nastier failure modes
Zebras in the spindle
• Disks can (emphatically!) read or write the wrong data
• Seeing this coming reality in the early 2000s, ZFS was designed
around total data path integrity via indirect checksums
• ZFS has discovered all manner of data corruption in storage
systems putatively too expensive to suffer such problems…
• And yet even ZFS oversimplified the failure modes of disks: in
15+ years of deploying ZFS, we have seen disks fail in much more
exotic ways than we thought possible
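To make the checksum idea concrete, here is a minimal sketch of a checksum held by the parent reference rather than by the block itself. It is not ZFS code: BlockStore, BlockPointer, and verified_read are hypothetical names, and SHA-256 is used only because it is convenient in Python. The point it illustrates is that a disk returning well-formed but wrong data (here, a simulated misdirected write) is caught because the expected checksum lives somewhere the bad write could not have touched.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Hash choice is an assumption made for this sketch.
    return hashlib.sha256(data).hexdigest()


class BlockStore:
    """Toy disk: stores bytes by address and never checks them."""

    def __init__(self):
        self.blocks = {}

    def write(self, addr, data):
        self.blocks[addr] = data

    def read(self, addr):
        return self.blocks[addr]


class BlockPointer:
    """Parent-side reference: child address plus checksum of expected contents."""

    def __init__(self, addr, data):
        self.addr = addr
        self.cksum = checksum(data)


def verified_read(store, ptr):
    # Recompute the checksum on every read and compare it with the
    # value recorded in the parent pointer.
    data = store.read(ptr.addr)
    if checksum(data) != ptr.cksum:
        raise IOError(f"checksum mismatch at block {ptr.addr}")
    return data


def demo():
    store = BlockStore()
    store.write(1, b"alpha")
    store.write(2, b"bravo")
    ptr1 = BlockPointer(1, b"alpha")
    clean = verified_read(store, ptr1)

    # Simulate a misdirected write: data meant for block 2 lands on block 1.
    store.write(1, b"bravo")
    try:
        verified_read(store, ptr1)
        detected = False
    except IOError:
        detected = True
    return clean, detected
```

A checksum stored alongside the data inside the same block would not catch this case: the misplaced block is internally consistent, so only a checksum kept in the referring block can tell that it is the wrong data.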
Zebras in the SSD
• Flash wears out so frequently and quickly that much of an SSD
is managing wear and mapping operations to functional flash
• There are entire universes of system software in every SSD!
• SSDs have incredible variety in their operating envelopes —
and can accordingly fail in wildly divergent ways
• This can represent systemic risk in that many SSDs can fail in
the same way at the same time…
• Confession: We’ve been so concerned about a flashtastrophy
that we have always grossly over-engineered our own SSDs
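As a rough illustration of the wear management and mapping described above, here is a toy flash translation layer. It is a sketch under heavy assumptions, not how any real SSD works: ToyFTL, max_erases, and writes_until_failure are hypothetical names, every write goes to the least-worn free block, and a block is retired once it reaches a fixed erase budget.

```python
class ToyFTL:
    """Toy flash translation layer mapping logical blocks to physical blocks."""

    def __init__(self, physical_blocks, max_erases):
        self.erases = [0] * physical_blocks   # erase count per physical block
        self.max_erases = max_erases          # hypothetical per-block erase budget
        self.mapping = {}                     # logical block -> physical block
        self.data = {}                        # physical block -> payload
        self.free = set(range(physical_blocks))

    def write(self, logical, payload):
        # Flash is not overwritten in place: pick a fresh physical block,
        # preferring the least-worn one that still has erase budget left.
        candidates = [p for p in self.free if self.erases[p] < self.max_erases]
        if not candidates:
            raise IOError("no functional flash left")
        target = min(candidates, key=lambda p: (self.erases[p], p))
        self.free.discard(target)
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        self.data[target] = payload
        if old is not None:
            # Reclaim the stale copy; the erase is what wears the block.
            del self.data[old]
            self.erases[old] += 1
            self.free.add(old)

    def read(self, logical):
        return self.data[self.mapping[logical]]


def writes_until_failure(ftl, logical=0):
    """Rewrite one logical block until the device runs out of usable flash."""
    count = 0
    try:
        while True:
            ftl.write(logical, count)
            count += 1
    except IOError:
        return count
```

Two instances built with the same parameters and driven by the same workload run out of usable flash after exactly the same number of writes, which is one simplified way to picture many identical SSDs failing in the same way at the same time.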
Zebras in the HBA
• The host bus adapter is responsible for brokering I/O from the
operating system to the physical devices
• This is more complicated than it might seem — and in particular,
HBA firmware is infamous for losing I/O under load
• From the perspective of system software this will be an I/O that
never returns — which means it will be timed out and retried
• While the system will maintain liveness, this will induce a
latency outlier — which can manifest itself far up the stack (e.g.,
TCP resets!)
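The lost-I/O scenario can be sketched with simulated time. Everything here is an assumption made for illustration: lossy_hba and submit_with_retry are hypothetical names, and the 1 ms service time, 5 second timeout, and retry count are made-up values rather than the settings of any real driver.

```python
def lossy_hba(lost_requests, service_time_s=0.001):
    """Build a fake device that silently drops the first attempt of some requests."""

    def device(request, attempt):
        if request in lost_requests and attempt == 0:
            return None            # the I/O never returns
        return service_time_s      # normal completion time in seconds

    return device


def submit_with_retry(device, request, timeout_s, max_retries):
    """Issue one request; return (status, elapsed seconds) in simulated time."""
    elapsed = 0.0
    for attempt in range(max_retries + 1):
        service_time = device(request, attempt)
        if service_time is not None and service_time <= timeout_s:
            return "ok", elapsed + service_time
        # The caller waits out the full timeout before retrying.
        elapsed += timeout_s
    return "failed", elapsed


def demo():
    # Request 7 is lost once; the other nine complete normally.
    device = lossy_hba(lost_requests={7})
    return [
        submit_with_retry(device, r, timeout_s=5.0, max_retries=2)
        for r in range(10)
    ]
```

Every request in the demo succeeds, so nothing is reported as an error, yet one of the ten takes a full timeout longer than the rest. That single outlier is the only visible symptom, and it is the kind of delay that can surface far up the stack.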
Zebras in the DIMM
• DRAM is a capacitor that must be periodically refreshed
• DRAM is susceptible to fatal failures (e.g., corrosion due to
humidity, temperature or other environmental failures)
• As the speed and density of DRAM have increased (and the
voltage has dropped), DRAM has become more susceptible to
transient bit failure not due to any hardware malfunction
• The “Firmware First” (!) model of error handling in x86 (and the
demise of CMCI) is leading to a silent epidemic of DIMM failure!
Zebras in the chassis
• Even the chassis itself is not immune from software failure
• For example, software and firmware control fan speed — and
failures in that software can result in fans stuck running at their
highest speed
• Fans are not designed to run at full power for extended periods
of time; they wear out or (worse) induce vibration in the chassis
• The effects of (say) vibration will be felt far from the source —
and again, may manifest only as latency, not as explicit failure
Zebras in the NIC
• Failure in the network interface card can be due to NIC firmware
failure or hardware failure (e.g., the optical transceiver)
• Networking failure should be entirely survivable by a distributed
system, but that doesn’t mean it’s without consequence!
• Use of the link aggregation control protocol (LACP) seems
tempting, but it can require more sophisticated software in the
switch (i.e., MLAG)…
• …which itself can lead to new failure modes!
Zebras in the top-of-rack switch
• As their own complicated ecosystem of software and firmware,
top-of-rack switches are prone to software failure
• Failure in the top-of-rack (or worse, the L3 core) can have an
enormous blast radius in a distributed system…
• For example, a switch that drops its ARP tables can result in a
distributed system going massively split brain…
• Or a switch that gets stuck broadcasting traffic can easily DDOS
an entire distributed system — revealing that there is a single-
point-of-failure after all!
Zebras all the way up
• These problems do not manifest themselves cleanly at the point
of origin for reasons both pragmatic and economic
• Hardware vendors don’t want gear shipped back for RCCA!
• Arguably, unreliable components allow (force?) upstack
software to discover its novel failure modes
• But that is an argument for debugging and resolving those
(additional) problems upstack, not for unreliable components!
Don’t fear the zebra
• The data path is not to be undertaken lightly
• Do not assume that testing and monitoring can substitute for
system understanding; enshrine observability
• Reward complete understanding, not merely resolution!
• As long as it’s unobservable, firmware is the enemy — and
trends toward sophisticated firmware are especially troubling!
• Open source software affords us a quality ratchet: we
shouldn’t spend our careers re-solving the same problems!
Further reading and viewing
• For an enlightening (and more positive) take on firmware, check
out the amazing videos of Micah Elizabeth Scott (@scanlime)
• For a snapshot of what we’re currently working on and thinking
about with respect to Manta/Triton, see the Joyent Requests for
Discussion (RFDs) — especially RFD 89 (“Project Tiresias”)
• For more on node.js debuggability, see Dave Pacheco’s talk on
“Industrial-grade node.js”
• Also, thank you to Amanda Lundberg of White Coat Captioning
for the superhuman real-time captioning!

More Related Content

What's hot

Linux Kernel Module - For NLKB
Linux Kernel Module - For NLKBLinux Kernel Module - For NLKB
Linux Kernel Module - For NLKBshimosawa
 
Secure container: Kata container and gVisor
Secure container: Kata container and gVisorSecure container: Kata container and gVisor
Secure container: Kata container and gVisorChing-Hsuan Yen
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the TradeCarlos Sierra
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolutionbcantrill
 
Real time operating systems (rtos) concepts 3
Real time operating systems (rtos) concepts 3Real time operating systems (rtos) concepts 3
Real time operating systems (rtos) concepts 3Abu Bakr Ramadan
 
[若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures [若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures Aj MaChInE
 
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsXPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsThe Linux Foundation
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsbcantrill
 
Operating System-Ch8 memory management
Operating System-Ch8 memory managementOperating System-Ch8 memory management
Operating System-Ch8 memory managementSyaiful Ahdan
 
Virtual machines and their architecture
Virtual machines and their architectureVirtual machines and their architecture
Virtual machines and their architectureMrinmoy Dalal
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
Tanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder
 
Linux Performance Tunning Memory
Linux Performance Tunning MemoryLinux Performance Tunning Memory
Linux Performance Tunning MemoryShay Cohen
 
Introduction To Linux Kernel Modules
Introduction To Linux Kernel ModulesIntroduction To Linux Kernel Modules
Introduction To Linux Kernel Modulesdibyajyotig
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWRpasalapudi
 
Ten query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowTen query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowKevin Kline
 
NVMe Over Fabrics Support in Linux
NVMe Over Fabrics Support in LinuxNVMe Over Fabrics Support in Linux
NVMe Over Fabrics Support in LinuxLF Events
 
What is a Kernel? : Introduction And Architecture
What is a Kernel? : Introduction And ArchitectureWhat is a Kernel? : Introduction And Architecture
What is a Kernel? : Introduction And Architecturepec2013
 
Embedded Systems: Lecture 1: Course Overview
Embedded Systems: Lecture 1: Course OverviewEmbedded Systems: Lecture 1: Course Overview
Embedded Systems: Lecture 1: Course OverviewAhmed El-Arabawy
 

What's hot (20)

Linux Kernel Module - For NLKB
Linux Kernel Module - For NLKBLinux Kernel Module - For NLKB
Linux Kernel Module - For NLKB
 
Secure container: Kata container and gVisor
Secure container: Kata container and gVisorSecure container: Kata container and gVisor
Secure container: Kata container and gVisor
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the Trade
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
 
Real time operating systems (rtos) concepts 3
Real time operating systems (rtos) concepts 3Real time operating systems (rtos) concepts 3
Real time operating systems (rtos) concepts 3
 
[若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures [若渴]Study on Side Channel Attacks and Countermeasures
[若渴]Study on Side Channel Attacks and Countermeasures
 
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsXPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
 
Operating System-Ch8 memory management
Operating System-Ch8 memory managementOperating System-Ch8 memory management
Operating System-Ch8 memory management
 
Virtual machines and their architecture
Virtual machines and their architectureVirtual machines and their architecture
Virtual machines and their architecture
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
Tanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools short
 
Monolithic kernel
Monolithic kernelMonolithic kernel
Monolithic kernel
 
Linux Performance Tunning Memory
Linux Performance Tunning MemoryLinux Performance Tunning Memory
Linux Performance Tunning Memory
 
Introduction To Linux Kernel Modules
Introduction To Linux Kernel ModulesIntroduction To Linux Kernel Modules
Introduction To Linux Kernel Modules
 
Analyzing and Interpreting AWR
Analyzing and Interpreting AWRAnalyzing and Interpreting AWR
Analyzing and Interpreting AWR
 
Ten query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowTen query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should know
 
NVMe Over Fabrics Support in Linux
NVMe Over Fabrics Support in LinuxNVMe Over Fabrics Support in Linux
NVMe Over Fabrics Support in Linux
 
What is a Kernel? : Introduction And Architecture
What is a Kernel? : Introduction And ArchitectureWhat is a Kernel? : Introduction And Architecture
What is a Kernel? : Introduction And Architecture
 
Embedded Systems: Lecture 1: Course Overview
Embedded Systems: Lecture 1: Course OverviewEmbedded Systems: Lecture 1: Course Overview
Embedded Systems: Lecture 1: Course Overview
 

Similar to Zebras all the way down: The engineering challenges of the data path

Real Time Debugging - What to do when a breakpoint just won't do
Real Time Debugging - What to do when a breakpoint just won't doReal Time Debugging - What to do when a breakpoint just won't do
Real Time Debugging - What to do when a breakpoint just won't doLloydMoore
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindbcantrill
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remanijaxconf
 
Network Troubleshooting.pptx
Network Troubleshooting.pptxNetwork Troubleshooting.pptx
Network Troubleshooting.pptxMohamedSafeer14
 
FreeBSD: Looking forward to another 10 years by Jordan Hubbard
FreeBSD: Looking forward to another 10 years by Jordan HubbardFreeBSD: Looking forward to another 10 years by Jordan Hubbard
FreeBSD: Looking forward to another 10 years by Jordan Hubbardeurobsdcon
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling SoftwareAbdelmonaim Remani
 
Non-Functional Requirements
Non-Functional RequirementsNon-Functional Requirements
Non-Functional RequirementsDavid Simons
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of databcantrill
 
SQL Server High Availability and DR - Too Many Choices!
SQL Server High Availability and DR - Too Many Choices!SQL Server High Availability and DR - Too Many Choices!
SQL Server High Availability and DR - Too Many Choices!Mike Walsh
 
The Power of Determinism in Database Systems
The Power of Determinism in Database SystemsThe Power of Determinism in Database Systems
The Power of Determinism in Database SystemsDaniel Abadi
 
Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionbcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemapsbcantrill
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applicationsAmit Kejriwal
 
Dmk sb2010 web_defense
Dmk sb2010 web_defenseDmk sb2010 web_defense
Dmk sb2010 web_defenseDan Kaminsky
 
Intro to distributed systems
Intro to distributed systemsIntro to distributed systems
Intro to distributed systemsAhmed Soliman
 
DevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesDevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesAlex Cruise
 
DataSeers - An HPCC Systems platform for Financial Analysis
DataSeers - An HPCC Systems platform for Financial AnalysisDataSeers - An HPCC Systems platform for Financial Analysis
DataSeers - An HPCC Systems platform for Financial AnalysisHPCC Systems
 
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year Program
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year ProgramCOMP-111 Past Paper 2022 complete Solution PU BS 4 Year Program
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year Programhaiderali8455
 

Similar to Zebras all the way down: The engineering challenges of the data path (20)

Real Time Debugging - What to do when a breakpoint just won't do
Real Time Debugging - What to do when a breakpoint just won't doReal Time Debugging - What to do when a breakpoint just won't do
Real Time Debugging - What to do when a breakpoint just won't do
 
Continuous Platformization
Continuous PlatformizationContinuous Platformization
Continuous Platformization
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
 
The economies of scaling software - Abdel Remani
The economies of scaling software - Abdel RemaniThe economies of scaling software - Abdel Remani
The economies of scaling software - Abdel Remani
 
Network Troubleshooting.pptx
Network Troubleshooting.pptxNetwork Troubleshooting.pptx
Network Troubleshooting.pptx
 
FreeBSD: Looking forward to another 10 years by Jordan Hubbard
FreeBSD: Looking forward to another 10 years by Jordan HubbardFreeBSD: Looking forward to another 10 years by Jordan Hubbard
FreeBSD: Looking forward to another 10 years by Jordan Hubbard
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
The Economies of Scaling Software
The Economies of Scaling SoftwareThe Economies of Scaling Software
The Economies of Scaling Software
 
Non-Functional Requirements
Non-Functional RequirementsNon-Functional Requirements
Non-Functional Requirements
 
The Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of dataThe Internet-of-things: Architecting for the deluge of data
The Internet-of-things: Architecting for the deluge of data
 
SQL Server High Availability and DR - Too Many Choices!
SQL Server High Availability and DR - Too Many Choices!SQL Server High Availability and DR - Too Many Choices!
SQL Server High Availability and DR - Too Many Choices!
 
The Power of Determinism in Database Systems
The Power of Determinism in Database SystemsThe Power of Determinism in Database Systems
The Power of Determinism in Database Systems
 
Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in production
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Dmk sb2010 web_defense
Dmk sb2010 web_defenseDmk sb2010 web_defense
Dmk sb2010 web_defense
 
Intro to distributed systems
Intro to distributed systemsIntro to distributed systems
Intro to distributed systems
 
DevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesDevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 Slides
 
DataSeers - An HPCC Systems platform for Financial Analysis
DataSeers - An HPCC Systems platform for Financial AnalysisDataSeers - An HPCC Systems platform for Financial Analysis
DataSeers - An HPCC Systems platform for Financial Analysis
 
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year Program
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year ProgramCOMP-111 Past Paper 2022 complete Solution PU BS 4 Year Program
COMP-111 Past Paper 2022 complete Solution PU BS 4 Year Program
 

More from bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Presentbcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmakingbcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...bcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Agebcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesbcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Lawbcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineeringbcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarebcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the unionbcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsbcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after darkbcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadershipbcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondbcantrill
 
Down Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab AllocatorDown Memory Lane: Two Decades with the Slab Allocator
Down Memory Lane: Two Decades with the Slab Allocatorbcantrill
 
The State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionThe State of Cloud 2016: The whirlwind of creative destruction
The State of Cloud 2016: The whirlwind of creative destructionbcantrill
 
Oral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsOral tradition in software engineering: Passing the craft across generations
Oral tradition in software engineering: Passing the craft across generationsbcantrill
 
The Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decadeThe Container Revolution: Reflections after the first decade
The Container Revolution: Reflections after the first decadebcantrill
 
Debugging (Docker) containers in production
Debugging (Docker) containers in productionDebugging (Docker) containers in production
Debugging (Docker) containers in productionbcantrill
 
Papers We Love: Jails and Zones
Papers We Love: Jails and ZonesPapers We Love: Jails and Zones
Papers We Love: Jails and Zonesbcantrill
 
Why it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalWhy it’s (past) time to run containers on bare metal
Why it’s (past) time to run containers on bare metalbcantrill
 

More from bcantrill (20)

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
 


Zebras all the way down: The engineering challenges of the data path

  • 1. Zebras all the way down: The engineering challenges of the data path. Bryan Cantrill, CTO (bryan@joyent.com, @bcantrill)
  • 2. The luxury of statelessness • In service-oriented software systems, we love statelessness • And for good reason: stateless components — like finite state machines — lend systems many desirable properties! • Stateless components can be easily made immutable, scalable, re-deployable, restartable, upgradeable, etc. etc. • Of course, persistent state still very much exists — we just use separation of concerns to confine the management of state to those services that do it explicitly and exclusively…
  • 3. The data path • The data path consists of the software, hardware, and firmware components between a service endpoint that offers persistence and the implementation of that persistence • The data path always ends in non-volatile storage, which (for now, anyway) means either flash or magnetic media • The data path traverses many subsystems and components — and nearly always is a distributed system itself • We place great demands upon the data path…
  • 4. The demands of the data path • A data path that merely works much of the time is insufficient • We (rightfully) expect perfection from the data path: we expect it to be consistent, available and partition-tolerant! • Of course, Brewer’s CAP theorem tells us that this isn’t actually possible — we must make tradeoffs • Even a well-engineered system can’t beat CAP — but a poorly engineered one will be flailed by it, becoming pathologically unavailable or inconsistent • Zebras are the difference
  • 5. Zebras? • In American medical slang, a zebra is a rare and exotic condition that can be conflated with more common ailments • Medical students and residents are cautioned against diagnosing them, to the point of aphorism: “when you hear hoofbeats, think of horses not zebras” • But — as anyone who has been afflicted by one will affirm — zebras emphatically exist!
  • 6. A zebra close to home
  • 7. Zebras in the data path? • Even though the data path runs on and ends with hardware, it consists of many disjoint and unseen software components • The paradox of software (especially that of the data path!) is that software is both information and machine • When software works correctly, it survives as information does: namely, in perpetuity • Especially where software is expensive to write and difficult to fix, there is an overwhelming bias towards extant software • Over time, the horses are found; only the zebras are left
  • 8. Hunting zebra • We must assume that unusual pathologies — especially in a distributed system — will not be readily reproducible! • When we are culturally afflicted with “bias for action”, it becomes tempting to immediately change the system to fix it • This is the wrong first motion: the choice to restore service versus understanding it is often a false dichotomy! • We must not change the system but rather observe it — we must focus not on snap hypotheses, but rather initial questions • The observability of the system is paramount!
  • 9. Observability at Joyent • Observability is an organizing principle at Joyent — it is a primary reason that we run SmartOS, our illumos derivative • Manta — our (open source, container-centric) object storage service — has SmartOS and ZFS at its core • Manta uses sharded PostgreSQL for metadata (+ ZooKeeper for leader election), with services primarily in node.js • We invested heavily in the observability and debuggability of node.js — and it is a (the?) reason we still use node.js
  • 10. Observability at Joyent Samsung! • Out of desire to build their own cloud based on Manta and Triton (our open source cloud management system), Samsung bought Joyent in June 2016 • While Manta has been in production for several years, Samsung’s level of scale has brought new-found challenges • Good news: between several years of production + observability (logging, DTrace, mdb) + hyperscale post-Samsung, we have nailed many thorny problems in Manta • Bad news: our stack — and that of every data path — has components that we still struggle to observe and debug…
  • 11. Zebra sanctuary • Unfortunately, the data path is laced with proprietary software that can’t be observed, audited, verified, or debugged • This is the software that interacts so directly with the hardware as to create the illusion of hardware to higher-level software • This is firmware, and it runs so dark and deep in the data path that much of it is impossible to see or catalogue • Firmware that operates silently will also fail implicitly — it is hardware failing with software’s failure modes
  • 12. Zebras in the spindle • Rotating magnetic media is a modern mechanical marvel • With sealed enclosures and helium-based drives, densities continue to increase — the disk will be with us for a long time! • Disks are vulnerable to vibe, temperature, particulates, aspersions, wear, etc. — magnetic media will fail! • But the disk knows this, and sophisticated on-head/on-controller firmware steers around failed media… • …leaving much nastier failure modes
  • 13. Zebras in the spindle • Disks can (emphatically!) read or write the wrong data • Seeing this coming reality in the early 2000s, ZFS was designed around total data path integrity via indirect checksums • ZFS has discovered all manner of data corruption in storage systems putatively too expensive to suffer such problems… • And yet even ZFS oversimplified the failure modes of disks: in 15+ years of deploying ZFS, we have seen disks fail in much more exotic ways than we thought possible
  • 14. Zebras in the SSD • Flash wears out so frequently and quickly that much of an SSD is managing wear and mapping operations to functional flash • There are entire universes of system software in every SSD! • SSDs have incredible variety in their operating envelopes — and can accordingly fail in wildly divergent ways • This can represent systemic risk in that many SSDs can fail in the same way at the same time… • Confession: We’ve been so concerned about a flashtastrophy that we have always grossly over-engineered our own SSDs
  • 15. Zebras in the HBA • The host bus adapter is responsible for brokering I/O from the operating system to the physical devices • This is more complicated than it might seem — and in particular, HBA firmware is infamous for losing I/O under load • From the perspective of system software this will be an I/O that never returns — which means it will be timed out and retried • While the system will maintain liveness, this will induce a latency outlier — which can manifest itself far up the stack (e.g., TCP resets!)
  • 16. Zebras in the DIMM • DRAM is a capacitor that must be periodically refreshed • DRAM is susceptible to fatal failures (e.g., corrosion due to humidity, temperature or other environmental failures) • As the speed and density of DRAM have increased (and the voltage has dropped), DRAM has become more susceptible to transient bit failure not due to any hardware malfunction • The “Firmware First” (!) model of error handling in x86 (and the demise of CMCI) is leading to a silent epidemic of DIMM failure!
  • 17. Zebras in the chassis • Even the chassis itself is not immune from software failure • For example, software and firmware control fan speed, and failures in that software can result in fans stuck running at their highest speed • Fans are not designed to run at full power for extended periods of time; they wear out or (worse) induce vibration in the chassis • The effects of (say) vibration will be felt far from the source and, again, may manifest only as latency rather than explicit failure
  • 18. Zebras in the NIC • Failure in the network interface card can be due to NIC firmware failure or hardware failure (e.g., the optical transceiver) • Networking failure should be entirely survivable by a distributed system, but that doesn’t mean it’s without consequence! • Use of the link aggregation control protocol (LACP) seems tempting, but can require more sophisticated software in the switch (i.e., MLAG)… • …which itself can lead to new failure modes!
  • 19. Zebras in the top-of-rack switch • As their own complicated ecosystem of software and firmware, top-of-rack switches are prone to software failure • Failure in the top-of-rack (or worse, the L3 core) can have an enormous blast radius in a distributed system… • For example, a switch that drops its ARP tables can result in a distributed system going massively split brain… • Or a switch that gets stuck broadcasting traffic can easily DDoS an entire distributed system, revealing that there is a single point of failure after all!
  • 20. Zebras all the way up • These problems do not manifest themselves cleanly at the point of origin for reasons both pragmatic and economic • Hardware vendors don’t want gear shipped back for RCCA! • Arguably, unreliable components allow (force?) upstack software to discover its novel failure modes • But that is an argument for debugging and resolving those (additional) problems upstack, not for unreliable components!
  • 21. Don’t fear the zebra • The data path is not to be undertaken lightly • Do not assume that testing and monitoring can substitute for system understanding; enshrine observability • Reward complete understanding, not merely resolution! • As long as it’s unobservable, firmware is the enemy — and trends toward sophisticated firmware are especially troubling! • Open source software affords us a quality ratchet: we shouldn’t spend our careers re-solving the same problems!
  • 22. Further reading and viewing • For an enlightening (and more positive) take on firmware, check out the amazing videos of Micah Elizabeth Scott (@scanlime) • For a snapshot of what we’re currently working on and thinking about with respect to Manta/Triton, see the Joyent Requests for Discussion (RFDs) — especially RFD 89 (“Project Tiresias”) • For more on node.js debuggability, see Dave Pacheco’s talk on “Industrial-grade node.js” • Also, thank you to Amanda Lundberg of White Coat Captioning for the superhuman real-time captioning!
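
Slide 13 says that ZFS was designed around data path integrity via indirect checksums. As a rough illustration of that idea (not ZFS's actual code or on-disk format), the sketch below keeps each block's checksum in the pointer held by its parent rather than next to the data itself, so a disk that silently returns the wrong bytes is caught on read. The `Disk`, `write_block`, and `read_block` names are hypothetical, and SHA-256 is used only because it ships in Python's standard library.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when a block's contents do not match the checksum in its parent pointer."""


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption made for convenience, not a claim about ZFS defaults.
    return hashlib.sha256(data).hexdigest()


class Disk:
    """Hypothetical toy disk: maps a block address to the bytes stored there."""

    def __init__(self):
        self.blocks = {}

    def write(self, addr: int, data: bytes) -> None:
        self.blocks[addr] = data

    def read(self, addr: int) -> bytes:
        return self.blocks[addr]


def write_block(disk: Disk, addr: int, data: bytes) -> dict:
    """Write data and return a block pointer (address plus checksum).

    The caller (the parent block) stores this pointer, so the checksum lives
    apart from the data it describes.
    """
    disk.write(addr, data)
    return {"addr": addr, "sum": checksum(data)}


def read_block(disk: Disk, ptr: dict) -> bytes:
    """Read the block named by ptr and verify it against the parent's checksum."""
    data = disk.read(ptr["addr"])
    if checksum(data) != ptr["sum"]:
        raise ChecksumError(f"block {ptr['addr']} failed verification")
    return data


if __name__ == "__main__":
    disk = Disk()
    ptr = write_block(disk, 7, b"customer record")
    read_block(disk, ptr)  # verifies cleanly

    # Simulate a disk that silently holds the wrong data at this address.
    disk.blocks[7] = b"someone else's data"
    try:
        read_block(disk, ptr)
    except ChecksumError as err:
        print("detected:", err)
```

Because the checksum travels with the pointer, data that is internally well-formed but wrong for that address (stale or misdirected, for example) still fails verification, which a checksum stored alongside the data could not guarantee.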