Worked on data flow tracking service used to comply with privacy focused government regulations and avoid billions of dollars in fines.
Redesigned platform for higher scalability and reliability. Contributions include new scalable MySQL schema, topology, and sharding strategy that supports 5x the existing workload.
Designed and implemented easier data ingestion process chosen by 24 users to manage 3000+ ingestion tasks.
Designed and implemented new subsystem to make 2TB of lineage per day available offline for privacy use cases. Compared to the old system I improved runtime from twenty-four to three hours, reliability from 89% to 98%, and reduced CPU 10x.
Designed metrics to measure parity of new system against pre-existing system and led #1 team goal of driving these metrics up to 99.9%.