
The Hardware-Software Co-Design Imperative: Why the Next AI Revolution Starts at the Silicon Level

Jul 10, 2025 · 7 min read

There's a narrative in AI that goes something like this: progress equals scale. Bigger models, more data, more compute. And for the past several years, this narrative has been remarkably productive. But we're approaching a wall — not of ideas, but of physics and economics.

Training frontier models now costs hundreds of millions of dollars. Inference at scale consumes entire power plants' worth of electricity. The carbon footprint of AI is becoming a serious concern. And yet, the models keep getting bigger.

Something has to give. And I believe the answer is hardware-software co-design.

What Co-Design Actually Means

Co-design isn't just "optimize your code for the GPU." It's a fundamentally different way of thinking about the AI stack. It means designing algorithms, software frameworks, and hardware architectures together, so that each layer is aware of and optimized for the others.

At IBM Research, this is what my team does every day. We don't just build accelerators and hope software catches up. We don't just write algorithms and hope hardware can run them efficiently. We design the full stack as an integrated system.

Three Levels of Co-Design

Algorithm-Hardware Co-Design: When we develop new neural architectures, we simultaneously evaluate them on target hardware. A model that's theoretically efficient but can't exploit the memory hierarchy of actual accelerators isn't really efficient. Our work on neural architecture search for analog in-memory computing is a prime example — we search for architectures that are specifically optimized for the physics of analog devices.
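To make the idea concrete, here is a toy sketch of what a hardware-aware search objective can look like. Everything here is hypothetical for illustration (the operator names, energy numbers, budget, and accuracies are invented, not taken from our actual NAS work): candidates are scored by a task metric *and* a cost model of the target device, so the search loop itself embodies the co-design principle.

```python
# Hypothetical per-operator energy costs (nJ) for some target accelerator.
OP_ENERGY = {"conv3x3": 9.0, "conv1x1": 2.0, "attention": 15.0, "mlp": 5.0}

def hardware_cost(arch):
    # Sum a simple lookup-table cost model over the candidate's operators.
    return sum(OP_ENERGY[op] for op in arch)

def score(arch, accuracy, energy_budget=20.0, penalty=0.02):
    # Joint objective: accuracy minus a penalty for exceeding the
    # device's energy budget -- hardware awareness baked into the search.
    overshoot = max(0.0, hardware_cost(arch) - energy_budget)
    return accuracy - penalty * overshoot

# Invented candidates with invented accuracy proxies.
candidates = {
    ("conv3x3", "conv3x3", "mlp"): 0.91,    # slightly more accurate, costly
    ("conv1x1", "attention", "mlp"): 0.90,  # cheaper on this device
}
best = max(candidates, key=lambda a: score(a, candidates[a]))
```

Note what happens: the nominally more accurate architecture loses once the device's energy budget is part of the objective. A model evaluated only on accuracy would have picked the wrong one for this hardware.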

Framework-System Co-Design: The software stack between the model and the hardware matters enormously. Techniques like FlashAttention, PagedAttention, and speculative decoding aren't just clever algorithms — they're co-designed with specific hardware capabilities in mind (memory bandwidth, cache sizes, parallel execution units).
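The memory-bandwidth point is easiest to see in code. Below is a minimal NumPy sketch of the core idea behind FlashAttention: process keys and values in blocks with a running (online) softmax, so the full N×N score matrix never has to exist in memory at once. This is the algorithmic skeleton only, not the actual kernel-level implementation.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full N x N score matrix -- memory-bound at scale.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def tiled_attention(Q, K, V, block=16):
    # Streams K/V block by block, keeping only a running max and a running
    # softmax denominator per query row -- the online-softmax trick that
    # lets the working set fit in fast on-chip memory.
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q)
    m = np.full(N, -np.inf)   # running max per query row
    l = np.zeros(N)           # running softmax denominator
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = (Q @ Kb.T) * scale                 # scores for this block only
        m_new = np.maximum(m, S.max(axis=-1))
        alpha = np.exp(m - m_new)              # rescale earlier partial sums
        P = np.exp(S - m_new[:, None])
        l = l * alpha + P.sum(axis=-1)
        out = out * alpha[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]
```

The two functions compute the same answer; the difference is entirely in *how* memory is touched. That "how" is invisible in the math but decisive on real accelerators, which is exactly the co-design point.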

Chip-Workload Co-Design: The most radical form of co-design is building hardware specifically for AI workloads. IBM's NorthPole chip, published in Science, takes inspiration from neuroscience to create a digital architecture that achieves remarkable energy efficiency for inference. It's not a general-purpose processor optimized for AI — it's an AI processor, period.

Why This Matters for the Industry

The companies that will win the next phase of AI aren't necessarily the ones with the biggest models. They're the ones that can deploy AI most efficiently — at the lowest cost, lowest latency, and lowest energy consumption. That's a co-design problem.

This is why I teach both High-Performance Machine Learning and Scaling LLMs at Columbia. Students need to understand the full stack. An AI engineer who can only write PyTorch is like a civil engineer who can only draw blueprints but doesn't understand materials science. You need both.

The Path Forward

The next AI revolution won't come from a single breakthrough model. It will come from a thousand optimizations across the stack — better quantization, smarter memory management, more efficient attention mechanisms, novel hardware paradigms, and the integration of all of these into coherent systems.
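As one small example of the kind of optimization I mean, here is a sketch of symmetric per-tensor int8 weight quantization: a few lines of arithmetic that cut weight storage 4x and unlock cheap integer math on hardware that supports it. (This illustrates the general technique, not any particular production scheme.)

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map the float range to int8
    # using a single scale factor derived from the largest magnitude.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; error is bounded by half a step.
    return q.astype(np.float32) * scale

w = np.random.randn(256).astype(np.float32)
q, s = quantize_int8(w)
max_err = np.abs(w - dequantize(q, s)).max()
```

Individually, a trick like this is unremarkable. Stacked with smarter memory management, efficient attention, and hardware built for the workload, these optimizations compound into the efficiency gains that matter.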

This is hard, unglamorous work. It doesn't make headlines the way a new chatbot does. But it's the work that will determine whether AI becomes a sustainable, accessible technology or remains an expensive luxury. And it starts at the silicon level.

Dr. Kaoutar El Maghraoui

Principal Research Scientist at IBM Research · Adjunct Professor at Columbia University