Analog Multilevel eDRAM-RRAM CIM for Zeroth-Order Fine-tuning of LLMs
Published in IEEE International Memory Workshop (IMW), 2025
Recommended citation: M. Chen, L. Zheng, J.-Y. Lin, P. D. Ye and H. Li, "Analog Multilevel eDRAM-RRAM CIM for Zeroth-Order Fine-tuning of LLMs," 2025 IEEE International Memory Workshop (IMW), Monterey, CA, USA, 2025, pp. 1-4, doi: 10.1109/IMW61990.2025.11026966. Keywords: compute-in-memory (CIM), eDRAM, RRAM, MLC, oxide semiconductors, LLM fine-tuning.
Abstract: Zeroth-order fine-tuning eliminates explicit back-propagation and reduces memory overhead for large language models (LLMs), making it a promising approach for on-device fine-tuning tasks. However, existing memory-centric accelerators fail to fully leverage these benefits because they struggle to balance bit density, compute-in-memory capability, and the endurance-retention trade-off. We present a reliability-aware, analog multi-level-cell (MLC) eDRAM-RRAM compute-in-memory (CIM) solution co-designed with zeroth-order optimization for language model fine-tuning. An RRAM-assisted eDRAM MLC programming scheme is developed, along with a process-voltage-temperature (PVT)-robust, large-sensing-window time-to-digital converter (TDC). The MLC eDRAM, integrating two-finger MOM capacitors, provides a 12× improvement in bit density over the state-of-the-art MLC design. A further 5× density gain and 2× retention improvement are obtained by adopting BEOL In2O3 FETs.
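The memory savings come from estimating gradients with forward passes only, so no activations or back-propagated gradients need to be stored. Below is a minimal sketch of a two-point zeroth-order (ZO-SGD) update of the kind such optimizers use; all names (loss_fn, mu, lr) and the toy quadratic objective are illustrative assumptions, not the authors' implementation or hardware mapping.

```python
# Minimal sketch of two-point zeroth-order gradient estimation (assumed setup,
# not the paper's implementation): two forward passes per step, no back-propagation.
import numpy as np

def zo_sgd_step(params, loss_fn, mu=1e-3, lr=1e-4, rng=np.random.default_rng(0)):
    """One ZO-SGD update using a random-direction finite-difference estimate."""
    u = rng.standard_normal(params.shape)            # random perturbation direction
    loss_plus = loss_fn(params + mu * u)             # forward pass at +mu*u
    loss_minus = loss_fn(params - mu * u)            # forward pass at -mu*u
    g_hat = (loss_plus - loss_minus) / (2 * mu) * u  # directional gradient estimate
    return params - lr * g_hat                       # SGD update from forward passes only

# Toy usage: "fine-tune" a quadratic; an LLM setting would replace loss_fn with a
# forward pass over a batch, so only the weights themselves must stay in memory.
w = np.ones(4)
for _ in range(1000):
    w = zo_sgd_step(w, lambda p: float(np.sum((p - 2.0) ** 2)), lr=1e-2)
```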
