a.danks@gmail.com
Open to hybrid/remote | H-1B transfer eligible
Andrew Danks
Platform & Data Infrastructure architect with 12+ years of experience scaling distributed systems, orchestration and streaming platforms, cloud infrastructure, and ML systems
Sr Staff Software Engineer, Affirm
Tech Lead, Batch & Streaming Data Platforms — Led core data infrastructure teams serving ~1000 engineers. Owned strategy and cross-functional alignment for Kafka, Spark, Temporal, Airflow, and Flink
Established company-wide paved paths, design constraints & guardrails for safe, scalable platform adoption.
Stepped into management calibration; reviewed promo packets and drove multiple Junior to Staff-level promotions
Reduced team on-call load by 50% via self-serve automation and leadership accountability reporting
Overhauled company-wide System Design interview template for consistent, higher-signal evaluation
Temporal Platform — Designed a production-grade Temporal platform, secured VP + eng-lead alignment, and delivered it to GA, powering stateful, durable Agentic/LLM harnesses and Capital pipelines
Kubernetes Migration — Designed & led cutover of 2000+ critical jobs to 17 EKS clusters, peak 500TiB memory. Reduced env provisioning from 2 months to 1 week, standardized DevEx, and made deploys 50% faster
Platform Reliability & Risk Mitigation
Architected usea1→usea2 multi-region deployment of financial pipelines for EC2 control-plane redundancy
Built a data-quality platform that blocks pipeline regressions, safeguarding billions from financial discrepancies.
Designed an SLA tracking system to ensure platform and financial pipelines meet contractual SLAs
Platform Modernization & Cost Efficiency
Luigi and Celery → Temporal: Secured org alignment and designed an adapter for zero-code migrations, dramatically improving reliability, o11y, and developer velocity. Presented at Replay 2026
Kinesis → Kafka: Managed project and designed zero-data-loss Kafka consumer cutover. $3M/year savings
Automated detection of over-provisioned Spark workloads. $2M/year savings
Search Suggest + ML — Led re-architecture of autocomplete ranking with new ML platform and XGBoost classifier for contextual filters. Improved click-through rate while maintaining low latency.
Real-time ML model for Store Visits — Improved Flink app throughput by >1000 msgs/sec for an online ML model that classifies customer visits from location pings. Featured at Flink Forward 2019
Chain Detection — Built Spark + ML system to detect retail chains at scale. Presented at PyBay 2018
Previous Roles
Software Engineer Intern, Yelp (Jun 2013 – Aug 2013, San Francisco)
Software Engineer Intern, Marin Software (May 2012 – Aug 2013, San Francisco)
Research Assistant, Computational Linguistics Group (Sep 2013 – Jun 2014, Univ. of Toronto)