IndiaGPU · the pitch · real data, dual-track

Reverse-engineer the GPU.
Build it open, in Pune.

India has one of the world's deepest chip-design talent pools and billions in semiconductor funding - yet builds no GPU of its own. We turn public GPU intelligence into an open, India-buildable roadmap.

✓ 14 source-cited research pillars · 28 metrics adversarially verified

The problem

One company owns the compute of the AI era

Nvidia's moat isn't a board secret - it's TSMC 4N silicon plus a CUDA software ecosystem more than 15 years in the making. Meeting it head-on with a 4nm-class monolith is a fab bet no newcomer wins.

4N

TSMC EUV node behind the RTX 5090 (GB202) - unreachable without a leading-edge fab.

CUDA

The real lock-in: a software ecosystem built over more than 15 years, not a single chip.

The cost of a leading-edge fab - the wrong door for India to push on first.

The insight - from Korea

Don't clone the GPU. Take the flank Nvidia under-serves.

South Korea didn't copy Nvidia. FuriosaAI and Rebellions built inference NPUs and won on efficiency - backed by a state program reported at ~$50B. The lesson is strategic, not just technical.

FuriosaAI's RNGD beat competitive GPUs on inference perf/watt running LG's EXAONE; the company reportedly declined a ~$800M Meta offer.

Rebellions × Sapeon - Korea's merged inference-first national champion.

~$50B (reported)

"K-NVIDIA" - multi-year state program for domestic AI chips.

The approach

A dual-track plan - honest about physics

Track A - Buildable in Pune today	Track B - 2035 open-GPU vision
Board-level RE: PCB, VRM, GDDR7 layout, recovered BOM	Monolithic-class open GPU on a maturing Indian node
Open RTL GPU IP (Vortex RISC-V, Tiny-GPU, MIAOW) on FPGA / 130-65nm shuttle	Chiplet open GPU, 2.5D/3D packaging done in-country
OSAT/ATMP packaging & chiplet assembly (Kaynes/TSAT-class)	Sovereign EDA + open PDK pipeline at scale

⚠️ A 4nm 5090-class monolith cannot be made by pick-and-place SMD. Track A is real, fundable work now; Track B is labeled vision - and that honesty is our credibility.

You don't start from zero

A whole GPU from six extra instructions

Vortex turns standard RISC-V into a SIMT GPU - compiler, driver, runtime and RTL - by adding just six instructions, and in v3.0 grows a real 3D pipeline plus tensor cores. It runs on commodity FPGAs today.

6

new instructions (wspawn, tmc, split, join, bar, tex) added to RISC-V to make a GPU.

MIAOW implements a subset of the AMD GCN instruction set - synthesizable, 32nm-validated (9.31 mm²).

A 32-core Vortex delivers GFLOPS-class throughput on a Stratix-10 FPGA - open RTL, running now.
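To make the six-instruction idea concrete, here is a minimal Python sketch of how a thread mask plus split/join can give one instruction stream SIMT-style branching. It is a toy model, not Vortex's RTL, ISA encoding or exact semantics: the `Warp` class, its method bodies and the `run_branch` helper are all hypothetical names invented for illustration, and only `tmc`, `split` and `join` are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Warp:
    """Toy model of one SIMT warp: a thread mask plus a reconvergence stack."""
    num_threads: int
    mask: list = field(default_factory=list)   # which lanes are active
    stack: list = field(default_factory=list)  # saved masks for reconvergence

    def __post_init__(self):
        if not self.mask:
            self.mask = [True] * self.num_threads

    def tmc(self, count):
        """Thread-mask control: activate only the first `count` lanes."""
        self.mask = [i < count for i in range(self.num_threads)]

    def split(self, predicate):
        """Diverge: run lanes where the predicate holds, remember the rest."""
        taken = [m and p for m, p in zip(self.mask, predicate)]
        not_taken = [m and not p for m, p in zip(self.mask, predicate)]
        self.stack.append(self.mask)   # full mask, restored at reconvergence
        self.stack.append(not_taken)   # else-path lanes, run after the then-path
        self.mask = taken

    def join(self):
        """Pop the next saved mask: first the else-path, then the full mask."""
        self.mask = self.stack.pop()


def run_branch(warp, data):
    """Per lane: halve even values, map odd values to 3x + 1."""
    out = list(data)
    warp.split([x % 2 == 0 for x in data])
    for lane, active in enumerate(warp.mask):   # then-path lanes
        if active:
            out[lane] = data[lane] // 2
    warp.join()
    for lane, active in enumerate(warp.mask):   # else-path lanes
        if active:
            out[lane] = data[lane] * 3 + 1
    warp.join()                                 # all lanes reconverge
    return out
```

Both sides of the branch execute one after the other under complementary masks, which is why divergent code costs SIMT hardware throughput. The hardware addition is small: a mask register and a stack, driven by a handful of instructions.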

Reverse engineering - legitimately

The 5090, taken apart honestly

Public teardowns and JEDEC datasheets let us reconstruct the GB202 board - power, memory, BOM - with no proprietary netlist. We're blunt about the one wall: the silicon is TSMC 4N EUV.

29

VRM phases (23 GPU + 6 memory) feeding a ~575W TGP - fully analyzable at board level.

PAM3

GDDR7 3-level signaling → ~1.7+ TB/s-class bandwidth. Reproducible board layout.

The die itself - EUV fab only. We label it vision, never a promise.
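The board-level figures above can be sanity-checked with a few lines of arithmetic. The sketch below is a back-of-envelope estimate, not a measurement. The 512-bit bus width and 28 Gbit/s per-pin rate are assumptions taken from public spec sheets rather than from this page, the 1.0 V core rail is a hypothetical round number, and sending the whole TGP through the 23 GPU phases deliberately overstates per-phase current.

```python
import math

# Assumed inputs (public spec-sheet values or round numbers, not from the text above).
BUS_WIDTH_BITS = 512     # assumed GDDR7 bus width on the RTX 5090
GBIT_PER_PIN = 28.0      # assumed per-pin data rate in Gbit/s
CORE_RAIL_VOLTS = 1.0    # hypothetical GPU core rail voltage


def pam3_bits_per_symbol():
    """GDDR7 packs 3 bits into 2 three-level symbols: 1.5 bits per symbol."""
    return 3 / 2


def pam3_capacity():
    """Information-theoretic ceiling for a 3-level symbol: log2(3) bits."""
    return math.log2(3)


def memory_bandwidth_gb_s(bus_width_bits, gbit_per_pin):
    """Peak bandwidth in GB/s: pins times per-pin rate, divided by 8 bits per byte."""
    return bus_width_bits * gbit_per_pin / 8


def amps_per_phase(power_w, rail_v, phases):
    """Average current per phase if `power_w` were shared evenly across `phases`."""
    return power_w / rail_v / phases


bandwidth = memory_bandwidth_gb_s(BUS_WIDTH_BITS, GBIT_PER_PIN)
per_phase = amps_per_phase(575, CORE_RAIL_VOLTS, 23)
print(f"{bandwidth / 1000:.3f} TB/s peak, {per_phase:.1f} A per GPU phase (upper bound)")
```

Under these assumptions the sketch gives 1792 GB/s, in line with the ~1.7+ TB/s-class figure, and at most about 25 A per GPU phase. Both numbers are the kind of result a board house can check against teardown photos and JEDEC datasheets.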

The build path

Four steps from teardown to taped-out open GPU

Board-level teardown

Reconstruct the GB202 board - power, memory, BOM - from public sources. Reproducible by an Indian board house.

Architecture intelligence

Map SIMT execution, memory hierarchy and the CUDA moat to know exactly what an open core must answer.

Open RTL on FPGA → mature node

Vortex / minimal core on FPGA, taped out on an accessible/open node (SCL Mohali, SkyWater shuttle).

Indian packaging

Assemble & test at Kaynes/TSAT-class OSAT; chiplets bridge to the 2035 vision.

Why now - India

The money and the talent are already here

India Semiconductor Mission (~$9.2B), 50% capex support, 13+ approved projects.

Micron Sanand OSAT; Kaynes shipped first paid chip modules Oct 2025; Tata TSAT Assam live.

Pune

Design centers for Intel, AMD, Qualcomm, MediaTek, Synopsys, Cadence - the human capital.

The competitor map

Everyone attacking Nvidia - and the one thing each does that it can't

Track B · 2035 vision (speculative)

Chiplets are the bridge to a sovereign GPU

Disaggregation (UCIe) means you no longer need one giant leading-edge die - you compose smaller dies, some on mature nodes, via standard interconnect. That is exactly the advanced packaging India is funding.

Labeled forward-looking. The path is credible precisely because each step on Track A is real today.
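The economic case for disaggregation can be sketched with the textbook Poisson yield model, Y = exp(-A × D0), where A is die area and D0 is defect density. The code below is a simplified illustration, not a costing of any real process. The 750 mm² total area and 0.1 defects/cm² are hypothetical inputs, the function names are invented, and the model ignores packaging yield, die-to-die interconnect area and test cost.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-(area_mm2 / 100.0) * defects_per_cm2)


def silicon_per_good_unit(total_area_mm2, n_dies, defects_per_cm2):
    """Wafer area (mm^2) consumed per working product built from n equal dies.

    Assumes each die is tested before assembly (known-good die), so a
    defect scraps only one small die rather than the whole product.
    """
    die_area = total_area_mm2 / n_dies
    return total_area_mm2 / poisson_yield(die_area, defects_per_cm2)


TOTAL_AREA_MM2 = 750.0   # hypothetical monolithic-class die area
DEFECT_DENSITY = 0.1     # hypothetical defects per cm^2

for n in (1, 2, 4, 8):
    area = silicon_per_good_unit(TOTAL_AREA_MM2, n, DEFECT_DENSITY)
    print(f"{n} die(s): {area:.0f} mm^2 of silicon per good unit")
```

Because yield falls exponentially with area, splitting the same logic into smaller dies wastes less silicon per working part. The gain only holds if packaging and interconnect overheads stay small, which is the part advanced packaging has to deliver.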

The engine under the hood

A research backend that stays honest

A 5-wave crawl4ai pipeline mines GitHub, Hugging Face, Google Patents and the paper frontier in parallel; a 1.58-bit BitNet model distils it CPU-side into a ChromaDB vector store. Every public claim carries a source and survives an adversarial verify pass.
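For readers unfamiliar with the term, "1.58-bit" refers to ternary weights: each weight is -1, 0 or +1, and log2(3) ≈ 1.58 bits. The sketch below shows the absmean quantization step described for BitNet b1.58, written in plain Python for illustration. It is not code from this pipeline, which uses an already-quantized model, and the function name and sample weights are hypothetical.

```python
import math

# Information content of one three-valued weight.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ternary_quantize(weights, eps=1e-8):
    """Absmean quantization: scale by the mean |w|, then round and clip to {-1, 0, 1}."""
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate reconstruction of the original weights."""
    return [q * scale for q in quantized]


q, s = ternary_quantize([0.9, -0.05, 0.4, -1.2])
print(q, round(s, 4), round(BITS_PER_TERNARY_WEIGHT, 2))
```

With only three weight values, matrix multiplication reduces to additions and subtractions, which is what makes CPU-side inference practical.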

crawl4ai

50× parallel sourced research, no silent failure.

BitNet 1.58b

Ternary-weight, CPU-side distillation into clean Markdown.

ChromaDB

Vectorized electronics know-how.

Quality gate

Sourced metrics only; vision labeled.

✓ 90 factors generated · 28 metrics adversarially verified · 1 corrected on the spot
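A quality gate of this kind can be sketched in a few lines. The version below is a hypothetical illustration of the stated policy (sourced metrics only, vision labeled), not the backend's actual code: the `Claim` fields, the `gate` function and the `[vision]` prefix are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    source: str = ""        # citation; an empty string means unsourced
    verified: bool = False  # True once the adversarial verify pass agrees
    vision: bool = False    # forward-looking statement (Track B style)


def gate(claims):
    """Split claims into publishable text and rejected text.

    Vision statements pass only with an explicit label; every other
    claim needs both a source and a successful verify pass.
    """
    published, rejected = [], []
    for claim in claims:
        if claim.vision:
            published.append("[vision] " + claim.text)
        elif claim.source and claim.verified:
            published.append(claim.text)
        else:
            rejected.append(claim.text)
    return published, rejected
```

Keeping the gate as a pure function over plain records makes the policy easy to audit: a rejected metric is visible in the second list rather than silently dropped.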

The ask

Oversee the process engineering. Build India's first open GPU.

The IP is open, the packaging is funded, the talent is in Pune. What's missing is a coordinated program to fuse them - board RE, open RTL, FPGA bring-up and OSAT assembly - into a working, documented, sovereign open GPU.

Explore the full teardown →
