[DICE] Efficient Yet General Pretraining for Integrated Circuits

Pretrained LLMs

Large Language Models (LLMs) have reshaped everyday work at an unprecedented scale. They are increasingly general-purpose problem solvers: as of 2026, systems like ChatGPT, Gemini, and Claude can write code, solve math problems, and retrieve information better than most people.

A standard pipeline for building production-ready LLMs has two stages: $\textcolor{green}{\texttt{(1) pretraining}}$ and $\textcolor{green}{\texttt{(2) fine-tuning}}$ for specific tasks. Think of pretraining as learning the basic rules of language, and fine-tuning as targeted practice for specialized problems.

Because of this, the core goal of pretraining is to make a model as “general” as possible, so that it transfers well across many tasks. Supervised training tends to specialize a model for particular objectives, much as memorizing every word in the Oxford dictionary does not automatically make someone a good writer. For this reason, prior work has relied heavily on unsupervised learning, where the training signal comes from the raw text itself rather than from task-specific labels, so that models capture broad statistical patterns in text. This helps pretrained LLMs generalize across diverse language tasks.
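To make the "no labels needed" point concrete, here is a minimal sketch of how raw text can supply its own training targets. It assumes whitespace tokenization and uses a toy bigram counter in place of a neural network. Both are illustrative simplifications, and the function names (`next_token_pairs`, `fit_bigram`, `predict_next`) and the tiny corpus are hypothetical, not taken from any real LLM pipeline.

```python
from collections import Counter, defaultdict


def next_token_pairs(text):
    """Turn raw text into (context, target) pairs; targets come from the text itself."""
    tokens = text.split()  # assumption: naive whitespace tokenization
    return [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]


def fit_bigram(corpus):
    """Count how often each token follows each context token (a toy 'pretraining' step)."""
    counts = defaultdict(Counter)
    for text in corpus:
        for context, target in next_token_pairs(text):
            counts[context][target] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent continuation seen after `context`, or None if unseen."""
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]


# Hypothetical unlabeled corpus: no human annotation is involved anywhere.
corpus = [
    "the adder drives the output",
    "the adder drives the carry",
    "the adder feeds the register",
]
model = fit_bigram(corpus)

print(predict_next(model, "adder"))  # most common word after "adder" in the corpus
```

Real pretraining replaces the counting step with gradient-based training of a large network, but the source of supervision is the same: the next token in unlabeled text.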

LLMs for Integrated Circuits: Are They Efficient?

Key Idea

Proposed Method

Results

Takeaway