The exponential growth of AI models presents a fundamental challenge: how do we continue to scale AI capabilities without proportionally scaling energy consumption and computational costs?
The answer lies in hardware-software co-design — an approach that optimizes AI systems holistically, from the silicon level to the application layer.
The Scaling Challenge
GPT-4 is estimated to have been trained on tens of thousands of GPUs over several months, and each successive generation of frontier models demands still more compute, memory, and energy. This trajectory is unsustainable without fundamental innovations in how we design and deploy AI systems.
Our Approach at IBM Research
At the AI Hardware Center, we're tackling this challenge from multiple angles. Our work on analog in-memory computing promises orders-of-magnitude improvements in energy efficiency for AI inference. Meanwhile, our neural architecture search frameworks automatically discover network architectures that are optimized for specific hardware targets.
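To make the analog in-memory idea concrete, the sketch below models a matrix-vector multiply performed directly in a crossbar array: weights are stored as conductances, and the multiply-accumulate happens where the data lives rather than after shuttling weights to a digital compute unit. It is a minimal standalone illustration in NumPy; the noise and ADC parameters are assumptions chosen for demonstration, not measured device characteristics (IBM's open-source aihwkit toolkit provides far more faithful device models).

```python
import numpy as np

rng = np.random.default_rng(0)

def analog_matvec(weights, x, conductance_noise=0.02, adc_bits=8):
    """Toy model of a matrix-vector multiply on an analog crossbar.

    Noise level and ADC resolution are illustrative assumptions, not
    characteristics of any real device.
    """
    # Programming error / device variation: each stored conductance
    # deviates slightly from the ideal weight value.
    noisy_weights = weights * (1.0 + conductance_noise * rng.standard_normal(weights.shape))

    # The in-memory multiply-accumulate: column currents sum the products
    # (Ohm's law for the multiply, Kirchhoff's current law for the add).
    analog_out = noisy_weights @ x

    # Read-out through a finite-resolution ADC.
    scale = np.max(np.abs(analog_out))
    if scale == 0.0:
        scale = 1.0
    levels = 2 ** (adc_bits - 1) - 1
    return np.round(analog_out / scale * levels) / levels * scale

# Compare against an exact digital matrix-vector multiply.
W = rng.standard_normal((64, 128)) * 0.1
x = rng.standard_normal(128)
exact = W @ x
approx = analog_matvec(W, x)
print("relative error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```

Because the weights never move, the dominant cost of a conventional inference pass (moving parameters between memory and compute) largely disappears; the price is the analog noise and quantization modeled above.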
The Co-Design Philosophy
The key insight is that hardware and software cannot be optimized in isolation. A neural network architecture that performs well on GPUs may be suboptimal for specialized AI accelerators. Conversely, hardware designed without understanding the computational patterns of modern AI workloads will leave performance on the table.
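One way to see this concretely is to score the same candidate architectures against two different latency models. The sketch below is a toy illustration, not IBM's search framework; the architectures, cost models, and all numbers are made-up assumptions, chosen only to show how the ranking can invert between a GPU-like target (where per-layer overhead dominates and wide, shallow networks shine) and an accelerator-like target (where large activations and the data movement they imply become the bottleneck).

```python
# Toy hardware-aware ranking. All architectures, cost models, and numbers
# are illustrative assumptions, not real measurements.

candidates = {
    # name: (estimated accuracy, GFLOPs, layer count, peak activation MB)
    "wide_shallow": (0.82, 4.0, 20, 220),
    "balanced":     (0.80, 3.5, 40, 120),
    "deep_narrow":  (0.81, 3.0, 80,  50),
}

def gpu_latency(gflops, layers, act_mb):
    # GPU-like model: fixed per-layer overhead plus compute time;
    # memory traffic is comparatively cheap.
    return 0.05 * layers + 0.5 * gflops

def accelerator_latency(gflops, layers, act_mb):
    # Accelerator-like model: a pipelined dataflow hides depth, but small
    # on-chip memory makes large activations expensive to move.
    return 0.3 * gflops + 0.04 * act_mb

def rank(latency_model):
    # Score = accuracy per unit latency; higher is better.
    scores = {
        name: acc / latency_model(gflops, layers, act_mb)
        for name, (acc, gflops, layers, act_mb) in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

print("GPU-like ranking:        ", rank(gpu_latency))
print("Accelerator-like ranking:", rank(accelerator_latency))
```

The same three candidates come out in opposite order under the two cost models, which is why a search framework has to fold the hardware target into its objective rather than optimizing the network in isolation.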
True co-design requires deep expertise in both domains — and a willingness to rethink assumptions at every level of the stack.

Dr. Kaoutar El Maghraoui
Principal Research Scientist at IBM Research · Adjunct Professor at Columbia University