India has one of the world's deepest chip-design talent pools and billions in semiconductor funding, yet builds no GPU of its own. We turn public GPU intelligence into an open, India-buildable roadmap.
Nvidia's moat isn't a board secret — it's TSMC 4N silicon plus a 15-year CUDA software ecosystem. Meeting it with a 4nm monolith is a fab bet no newcomer wins.
4N: the TSMC EUV node behind the RTX 5090 (GB202), unreachable without a leading-edge fab.
CUDA: the real lock-in, a 15-year software ecosystem rather than a single chip.
The cost of a leading-edge fab — the wrong door for India to push on first.
South Korea didn't copy Nvidia. FuriosaAI and Rebellions built inference NPUs and won on efficiency — backed by a state program reported at ~$50B. The lesson is strategic, not just technical.
FuriosaAI RNGD inference perf/watt vs competitive GPUs on LG's EXAONE; reportedly declined a ~$800M Meta offer.
Rebellions × Sapeon — Korea's merged inference-first national champion.
"K-NVIDIA" — multi-year state program for domestic AI chips.
| Track A — Buildable in Pune today | Track B — 2035 open-GPU vision |
|---|---|
| Board-level RE: PCB, VRM, GDDR7 layout, recovered BOM | Monolithic-class open GPU on a maturing Indian node |
| Open RTL GPU IP (Vortex RISC-V, Tiny-GPU, MIAOW) on FPGA / 130–65nm shuttle | Chiplet open GPU, 2.5D/3D packaging done in-country |
| OSAT/ATMP packaging & chiplet assembly (Kaynes/TSAT-class) | Sovereign EDA + open PDK pipeline at scale |
Vortex turns standard RISC-V into a SIMT GPU — compiler, driver, runtime and RTL — by adding just six instructions, and in v3.0 grows a real 3D pipeline plus tensor cores. It runs on commodity FPGAs today.
Six new instructions (wspawn, tmc, split, join, bar, tex) added to RISC-V to make a GPU.
AMD GCN instructions implemented by MIAOW — synthesizable, 32nm-validated (9.31 mm²).
GFLOPS from a 32-core Vortex on a Stratix-10 FPGA — open silicon, running now.
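How split and join keep a diverging warp correct can be sketched with a toy thread-mask model. This is a minimal Python illustration under stated assumptions, not Vortex's RTL or exact ISA semantics: the `Warp` class, the layout of its divergence stack and the `run_if_else` helper are hypothetical names introduced here. Only the instruction names (tmc, split, join) come from the list above.

```python
class Warp:
    """Toy SIMT warp: one program counter, one active-thread mask."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.tmask = [True] * num_threads  # active-thread mask (what tmc sets)
        self.stack = []                    # divergence stack used by split/join

    def tmc(self, mask):
        """Thread-mask control: choose which threads are active."""
        self.tmask = list(mask)

    def split(self, pred):
        """Diverge on a per-thread predicate. Returns True if the warp diverged."""
        taken = [a and p for a, p in zip(self.tmask, pred)]
        not_taken = [a and not p for a, p in zip(self.tmask, pred)]
        if not any(taken) or not any(not_taken):
            # Uniform branch: push a marker so the later join stays balanced.
            self.stack.append(("uniform", list(self.tmask)))
            return False
        self.stack.append(("restore", list(self.tmask)))  # mask after reconvergence
        self.stack.append(("else", not_taken))            # mask for the other path
        self.tmask = taken
        return True

    def join(self):
        """Pop one stack entry; 'else' means the other path still has to run."""
        kind, mask = self.stack.pop()
        self.tmask = mask
        return kind


def run_if_else(warp, data, pred_fn, then_fn, else_fn):
    """Execute `if pred: then else: else` across all threads of one warp."""
    out = list(data)
    pred = [pred_fn(x) for x in data]
    warp.split(pred)
    passes = 0
    while True:
        # Within one pass every active thread has the same predicate value.
        for t in range(warp.num_threads):
            if warp.tmask[t]:
                out[t] = then_fn(data[t]) if pred[t] else else_fn(data[t])
        passes += 1
        if warp.join() != "else":
            break
    return out, passes


warp = Warp(4)
diverged_out, diverged_passes = run_if_else(
    warp, [1, 2, 3, 4], lambda x: x % 2 == 0, lambda x: x * 10, lambda x: x + 1
)
uniform_out, uniform_passes = run_if_else(
    warp, [2, 4, 6, 8], lambda x: x % 2 == 0, lambda x: x * 10, lambda x: x + 1
)
print(diverged_out, diverged_passes)  # [2, 20, 4, 40] 2
print(uniform_out, uniform_passes)    # [20, 40, 60, 80] 1
```

The divergent case costs two serialized passes while the uniform case costs one, which is the efficiency trade-off any open SIMT core has to manage.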
Public teardowns and JEDEC datasheets let us reconstruct the GB202 board — power, memory, BOM — with no proprietary netlist. We're blunt about the one wall: the silicon is TSMC 4N EUV.
29 VRM phases (23 GPU + 6 memory) feeding a ~575W TGP, fully analyzable at board level.
GDDR7 3-level (PAM3) signaling → 1.7+ TB/s-class bandwidth. The board layout is reproducible.
The die itself — EUV fab only. We label it vision, never a promise.
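The memory bandwidth figure above follows from plain bus arithmetic, which is part of why it is analyzable without any proprietary data. The sketch below assumes a 512-bit memory bus and 28 Gbps per pin, the publicly reported RTX 5090 configuration; neither number appears in this page, so treat both as assumptions. It also assumes PAM3 carries 3 bits per 2 symbols, as described for GDDR7. The function names are illustrative.

```python
def gddr_bandwidth_gb_s(bus_width_bits, data_rate_gbps_per_pin):
    """Peak bandwidth in GB/s: pins x bits-per-second-per-pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps_per_pin / 8


def pam3_symbol_rate_gbaud(data_rate_gbps_per_pin):
    """PAM3 as used by GDDR7 packs 3 bits into 2 symbols (1.5 bits per symbol)."""
    bits_per_symbol = 3 / 2
    return data_rate_gbps_per_pin / bits_per_symbol


# Assumed values (publicly reported, not taken from this page).
BUS_WIDTH_BITS = 512
DATA_RATE_GBPS = 28

bandwidth = gddr_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS)
symbol_rate = pam3_symbol_rate_gbaud(DATA_RATE_GBPS)

print(f"peak bandwidth: {bandwidth:.0f} GB/s")        # 1792 GB/s
print(f"PAM3 symbol rate: {symbol_rate:.2f} GBaud")   # 18.67 GBaud
```

Under these assumptions the peak works out to about 1.79 TB/s, consistent with the 1.7+ TB/s class quoted above.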
Reconstruct the GB202 board — power, memory, BOM — from public sources. Reproducible by an Indian board house.
Map SIMT execution, memory hierarchy and the CUDA moat to know exactly what an open core must answer.
Bring up Vortex or a minimal core on FPGA, then tape out on an accessible or open node (SCL Mohali, SkyWater shuttle).
Assemble & test at Kaynes/TSAT-class OSAT; chiplets bridge to the 2035 vision.
India Semiconductor Mission (~$9.2B), 50% capex support, 13+ approved projects.
Micron Sanand OSAT; Kaynes shipped first paid chip modules Oct 2025; Tata TSAT Assam live.
Design centers for Intel, AMD, Qualcomm, MediaTek, Synopsys, Cadence — the human capital.
Disaggregation (UCIe) means you no longer need one giant leading-edge die — you compose smaller dies, some on mature nodes, via standard interconnect. That is exactly the advanced packaging India is funding.
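The yield argument behind disaggregation can be made concrete with the textbook Poisson yield model, Y = exp(-A · D0). This is a rough sketch under loud assumptions: the 750 mm² monolithic area, the four-way split and the defect density of 0.1 defects/cm² are illustrative values chosen here, not measured data. It also assumes known-good-die testing before assembly and ignores packaging yield and die-to-die interface area.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of good dies under the simple Poisson model Y = exp(-A * D0)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(die_area_mm2, dies_per_product, defects_per_cm2):
    """Wafer area (mm^2) consumed per working product, with known-good-die testing."""
    y = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / y


# Illustrative assumptions, not measured values.
D0 = 0.1                 # defects per cm^2
MONOLITHIC_MM2 = 750.0   # one large die
NUM_CHIPLETS = 4         # same total area split four ways
CHIPLET_MM2 = MONOLITHIC_MM2 / NUM_CHIPLETS

mono_yield = poisson_yield(MONOLITHIC_MM2, D0)
chiplet_yield = poisson_yield(CHIPLET_MM2, D0)
mono_cost = silicon_per_good_product(MONOLITHIC_MM2, 1, D0)
chiplet_cost = silicon_per_good_product(CHIPLET_MM2, NUM_CHIPLETS, D0)

print(f"monolithic yield: {mono_yield:.1%}, silicon per good part: {mono_cost:.0f} mm^2")
print(f"chiplet yield:    {chiplet_yield:.1%}, silicon per good part: {chiplet_cost:.0f} mm^2")
```

With these numbers the monolithic die yields roughly 47% against roughly 83% per chiplet, so the chiplet route consumes a little over half the silicon per working product. Real results depend on the actual defect density and on packaging losses.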
A 5-wave crawl4ai pipeline mines GitHub, Hugging Face, Google Patents and the paper frontier in parallel; a 1.58-bit BitNet model distils it CPU-side into a ChromaDB vector store. Every public claim carries a source and survives an adversarial verify pass.
50× parallel sourced research, no silent failure.
Ternary (1.58-bit) CPU distillation into clean Markdown.
Vectorized electronics know-how.
Sourced metrics only; vision labeled.
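The "1.58-bit" label comes from ternary weights: each weight takes one of three values, and log2(3) ≈ 1.58 bits. Below is a minimal sketch of absmean ternary quantization as publicly described for BitNet b1.58. Whether the model in this pipeline quantizes exactly this way is an assumption, and the function name and sample weights are hypothetical.

```python
import math


def absmean_ternary(weights, eps=1e-8):
    """Quantize weights to {-1, 0, +1} using the mean absolute value as the scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    # Scale, round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate reconstruction: each ternary value times the shared scale."""
    return [q * scale for q in quantized]


sample = [0.9, -0.05, 0.4, -1.2]  # hypothetical weights
q, s = absmean_ternary(sample)
bits_per_weight = math.log2(3)

print(q)                          # [1, 0, 1, -1]
print(round(s, 4))                # 0.6375
print(round(bits_per_weight, 2))  # 1.58
```

Because every weight is -1, 0 or +1, matrix multiplies reduce to additions and subtractions, which is what makes CPU-side inference plausible.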
The IP is open, the packaging is funded, the talent is in Pune. What's missing is a coordinated program to fuse them — board RE, open RTL, FPGA bring-up and OSAT assembly — into a working, documented, sovereign open GPU.