Zhifei Yang
System Software Engineer @ TikTok

I am a low-level systems software engineer and researcher. I currently work on the performance and reliability engineering of CPU/GPU system software for large-scale AI workloads.
I received my bachelor's degree from Harbin Institute of Technology and my master's degree from ETH Zürich. After graduating, I spent two years doing computer systems research at the University of Chicago and EPFL before joining TikTok's System Technologies and Engineering team in London, UK.
Publications & Talks
-
Boosting Network Performance of Confidential VM Using Userspace Stack
William Lam, Yuwei Zhang, Zhifei Yang
In DPDK Summit 2024 (video).
-
Fast and Secure: DPDK Meets Confidential Computing
Zhifei Yang, Liang Ma
In DPDK Summit 2023 (video).
-
Odinfs: Scaling PM performance with Opportunistic Delegation
Diyu Zhou, Yuchen Qian, Vishal Gupta, Zhifei Yang, Changwoo Min, and Sanidhya Kashyap
In Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22).
-
Motivating High Performance Serverless Workloads
Hai Duc Nguyen, Zhifei Yang, Andrew A. Chien
In Proceedings of the 1st Workshop on High Performance Serverless Computing (HiPS '21).
Experiences
-
System Software Engineer, TikTok.
System Technologies & Engineering - UK, 2023 ~ Now.
Performance tuning and reliability engineering of TikTok/ByteDance's global system software, focusing on high-performance GPU infrastructure and the Linux kernel.
-
Research Assistant, School of Computer and Communication Sciences, EPFL.
Supervisors: Prof. Sanidhya Kashyap and Prof. Babak Falsafi. 09.2021 ~ 12.2022.
Studied ways to achieve near-zero virtualization overhead and scalable virtual memory address translation.
-
Research Assistant, Large-scale Sustainable Systems Group, The University of Chicago.
Advisor: Prof. Andrew A. Chien. 09.2020 ~ 06.2021.
Contributed to Realtime Serverless, a novel cloud software architecture that enhances Function-as-a-Service with application-defined performance guarantees.
-
Research Assistant and Master's Thesis, Systems Group, ETH Zürich.
Advisor: Prof. Gustavo Alonso. 2018 ~ 2020.
Developed various database operations on FPGAs for better performance and energy efficiency.
-
Research Intern, Systems Research Group, Microsoft Research Asia.
Mentors: Dr. Hucheng Zhou and Dr. Lintao Zhang. 06.2016 ~ 04.2017.
Optimized the distributed training performance of a popular ML model: gradient boosting decision trees.
Teaching
-
TA of CS-173 Digital System Design, EPFL. Spring 2022.
-
TA of High-level Language Programming, Harbin Institute of Technology. 2014 ~ 2015.
Services
-
OSDI'22 and ATC'22 Artifact Evaluation Committee. 2022.
-
PPoPP'22 Artifact Evaluation Committee. 2021.
To seek, to grind, to create.