Session Title
Running LLMs in Node.js: Wrapping GGML for High-Performance AI
Abstract
Python dominates the AI landscape, but the heavy lifting is done by native C/C++ libraries such as GGML, the tensor library behind llama.cpp. Node.js developers shouldn’t have to spawn Python processes to access this power.
In this talk, we will build a high-performance tensor addon from scratch using Node-API. We will wrap the GGML library to perform matrix multiplications and other tensor operations directly in Node.js, at native speed. You will learn how to structure ABI-stable addons, manage memory between V8 and C++, and finally prove that JavaScript is ready for the AI era.
Description
The “AI Engineering” stack is shifting. It’s no longer just about calling OpenAI APIs; it’s about running efficient, local inference. While Python has torch and numpy, Node.js has often been left behind—until now.
In this deep-dive session, we will:
- Deconstruct the Problem: Why is Python “fast” for AI? (Hint: it’s bindings to C/C++ code.)
- Meet the Tooling: Introduction to GGML (the library that powers local LLMs) and Node-API (the ABI-stable interface for native Node.js addons, independent of V8 internals).
- Live Code the Solution:
  - Setting up node-addon-api without the “V8 headers” headache.
  - Wrapping a ggml tensor operation (like matrix multiplication).
  - Benchmarking: JS implementation vs. Python/NumPy vs. our Node+GGML addon.
- Advanced Techniques: Handling heavy computation without blocking the Node.js Event Loop (using uv_queue_work or Napi::AsyncWorker).
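To give a rough idea of the pure-JavaScript side of the benchmarking step, a naive baseline might look like the sketch below. The names matmulNaive and timeMs are hypothetical helpers for illustration only; they are not part of GGML, Node-API, or the code shown in the talk, and the NumPy and native-addon sides of the comparison are not shown.

```typescript
// Naive row-major matrix multiply: C (m x n) = A (m x k) * B (k x n).
// Hypothetical baseline helper, not a GGML or Node-API function.
function matmulNaive(
  a: Float32Array,
  b: Float32Array,
  m: number,
  k: number,
  n: number
): Float32Array {
  if (a.length !== m * k || b.length !== k * n) {
    throw new RangeError("matrix dimensions do not match buffer lengths");
  }
  const c = new Float32Array(m * n); // zero-initialized
  for (let i = 0; i < m; i++) {
    for (let p = 0; p < k; p++) {
      const aip = a[i * k + p];
      // Walk a row of B and a row of C contiguously.
      for (let j = 0; j < n; j++) {
        c[i * n + j] += aip * b[p * n + j];
      }
    }
  }
  return c;
}

// Average wall-clock time per call in milliseconds.
// Date.now() is coarse, so use enough iterations to get a stable number.
function timeMs(fn: () => void, iterations: number): number {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn();
  return (Date.now() - start) / iterations;
}

// Sanity check on a 2 x 2 example.
const a = new Float32Array([1, 2, 3, 4]);
const b = new Float32Array([5, 6, 7, 8]);
console.log(Array.from(matmulNaive(a, b, 2, 2, 2))); // [ 19, 22, 43, 50 ]

// Timing on a larger square matrix (the size is an arbitrary choice).
const size = 128;
const x = new Float32Array(size * size).fill(1);
const y = new Float32Array(size * size).fill(2);
console.log(`avg ms: ${timeMs(() => matmulNaive(x, y, size, size, size), 10)}`);
```

Note that this loop runs synchronously on the main thread, so a large multiplication blocks the event loop for its full duration. That is the problem the Advanced Techniques item addresses on the native side.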
You will walk away with a working pattern to integrate any C++ AI library into your Node.js stack, breaking the Python dependency chain.
Key Outcomes
After attending this talk, audience members will be able to:
- Understand the architecture of AI libraries (C++ core + high-level bindings) and replicate it in Node.
- Use Node-API to wrap the GGML tensor library for native-speed matrix operations.
- Build performant addons that don’t block the Node.js Event Loop.
- Create ABI-stable binaries that survive Node.js version upgrades.
Target Audience
Intermediate to Advanced Node.js engineers. No PhD in Math required—just a curiosity about how to make Node.js go fast and how modern AI libraries work under the hood.