I will fix MediaPipe GPU delegate errors on ARM Linux, Docker, or headless servers

richter1976
Richter

About this gig

Is MediaPipe GPU delegate failing on your ARM device, Docker container, or headless server?


Common errors I fix:

"Failed creating base context during opening of kernel driver"

"eglGetDisplay() returned EGL_NO_DISPLAY"

"Kernel module may not have been loaded"

GPU delegate silently falls back to CPU with no error

MediaPipe works on desktop but crashes on edge/embedded devices
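
Most of these errors trace back to a missing device node or library, so a quick pre-flight check helps narrow things down before touching MediaPipe. The sketch below is an illustration only, using the Python standard library; the function name `gpu_preflight` is hypothetical, and it assumes the GPU is exposed as a DRM render node under `/dev/dri` (some vendor drivers use other device nodes, which this does not cover).

```python
import ctypes.util
import glob
import os


def gpu_preflight(dri_dir: str = "/dev/dri") -> dict:
    """Collect basic facts a headless EGL/GBM setup usually depends on."""
    # DRM render nodes are named renderD128, renderD129, ...
    render_nodes = sorted(glob.glob(os.path.join(dri_dir, "renderD*")))
    return {
        "render_nodes": render_nodes,
        # Nodes the current user can open read/write (often a group/permission issue in Docker).
        "accessible_nodes": [
            node for node in render_nodes if os.access(node, os.R_OK | os.W_OK)
        ],
        # None means the dynamic linker could not locate the library.
        "libEGL": ctypes.util.find_library("EGL"),
        "libgbm": ctypes.util.find_library("gbm"),
        # Unset on a truly headless machine.
        "DISPLAY": os.environ.get("DISPLAY"),
    }


if __name__ == "__main__":
    for key, value in gpu_preflight().items():
        print(f"{key}: {value}")
```

An empty `render_nodes` list inside a container usually means the device was not passed through; an empty `accessible_nodes` list with a non-empty `render_nodes` list points to permissions.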


I compiled MediaPipe 0.10.35 from source with Bazel, with the EGL/GBM GPU delegate enabled, on an ARM Mali GPU running fully headless (no X11, no Wayland, no Xvfb). The result: a 2.3x speedup over CPU.


What most sellers don't know:

MediaPipe GPU delegate uses EGL, NOT CUDA, even on Jetson

EGL requires a display server by default; I patched it to use GBM (Generic Buffer Management) for true headless operation

This works on Mali (RK3576/RK3588), VideoCore (RPi 5), and Adreno GPUs
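
To illustrate the idea of reaching EGL through GBM instead of a display server, here is a minimal probe written with `ctypes`. It is a sketch under stated assumptions, not the patch applied to MediaPipe: the function name `probe_headless_egl` and the default render node path are hypothetical, and it assumes the system `libEGL` exports `eglGetPlatformDisplay` (EGL 1.5) and supports the GBM platform.

```python
import ctypes
import ctypes.util
import os

EGL_PLATFORM_GBM_KHR = 0x31D7  # enum value from the EGL_KHR_platform_gbm extension


def probe_headless_egl(render_node="/dev/dri/renderD128"):
    """Return the EGL (major, minor) version reached through GBM, or None."""
    egl_path = ctypes.util.find_library("EGL")
    gbm_path = ctypes.util.find_library("gbm")
    if not (egl_path and gbm_path and os.path.exists(render_node)):
        return None  # missing library or render node: nothing to probe
    try:
        egl = ctypes.CDLL(egl_path)
        gbm = ctypes.CDLL(gbm_path)
        # Declare C signatures so pointers are not truncated to 32-bit ints.
        gbm.gbm_create_device.restype = ctypes.c_void_p
        gbm.gbm_create_device.argtypes = [ctypes.c_int]
        gbm.gbm_device_destroy.restype = None
        gbm.gbm_device_destroy.argtypes = [ctypes.c_void_p]
        egl.eglGetPlatformDisplay.restype = ctypes.c_void_p
        egl.eglGetPlatformDisplay.argtypes = [
            ctypes.c_uint, ctypes.c_void_p, ctypes.c_void_p]
        egl.eglInitialize.restype = ctypes.c_uint
        egl.eglInitialize.argtypes = [
            ctypes.c_void_p,
            ctypes.POINTER(ctypes.c_int),
            ctypes.POINTER(ctypes.c_int)]
        egl.eglTerminate.restype = ctypes.c_uint
        egl.eglTerminate.argtypes = [ctypes.c_void_p]
        fd = os.open(render_node, os.O_RDWR)
    except (OSError, AttributeError):
        return None  # load failure, missing symbol, or node not openable

    try:
        device = gbm.gbm_create_device(fd)
        if not device:
            return None
        try:
            # Ask EGL for a display backed by the GBM device, not X11/Wayland.
            display = egl.eglGetPlatformDisplay(EGL_PLATFORM_GBM_KHR, device, None)
            if not display:  # EGL_NO_DISPLAY is a null handle
                return None
            major, minor = ctypes.c_int(0), ctypes.c_int(0)
            if not egl.eglInitialize(display, ctypes.byref(major), ctypes.byref(minor)):
                return None
            egl.eglTerminate(display)
            return major.value, minor.value
        finally:
            gbm.gbm_device_destroy(device)
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("EGL via GBM:", probe_headless_egl())
```

A `None` result means at least one link in the chain (library, render node, GBM device, EGL display, initialization) is unavailable; a version tuple means EGL initialized with no display server involved.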


Live demo (terminal recording): https://asciinema.org/a/Mv4LEGvaroBSs6oJ


I handle:

ARM aarch64 compilation from source (Bazel + CMake)

Docker GPU pass-through for MediaPipe

Headless EGL/GBM patching

Performance benchmarking (CPU vs GPU)
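
For the CPU vs GPU comparison, a small timing harness along these lines keeps the numbers honest. This is a generic sketch, not the exact benchmark behind the 2.3x figure: `median_latency_ms` and `speedup` are hypothetical helper names, and `run_once` stands in for whatever callable runs one inference on a fixed input.

```python
import statistics
import time


def median_latency_ms(run_once, warmup=5, runs=50):
    """Median wall-clock latency of run_once() in milliseconds."""
    # Warm-up iterations absorb one-time costs such as shader compilation.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to scheduler hiccups than the mean.
    return statistics.median(samples)


def speedup(cpu_ms, gpu_ms):
    """How many times faster the GPU path is than the CPU path."""
    return cpu_ms / gpu_ms
```

Timing the same input through a CPU-configured and a GPU-configured pipeline and passing both medians to `speedup` gives the ratio. If the ratio stays close to 1.0, the GPU delegate may have silently fallen back to CPU.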


Platform: Python 3.10-3.12, Linux ARM64, Docker-compatible


Get to know Richter

Richter
4.8 (4)
  • From: China
  • Member since: Oct 2024
  • Last delivery: 1 year
  • Languages: English, Chinese, German

I build computer vision systems that ship, on NVIDIA CUDA servers and ARM edge devices. Not demos. Production.

6 projects deployed in 12 months: YOLO detection + tracking on CUDA and NPU (17x speedup), multi-camera RTSP pipelines with FFmpeg hardware decoding, MediaPipe GPU compiled from source for ARM Mali (2.3x faster, headless), PyTorch custom model training, and rPPG contactless vital-sign measurement from video.

Stack: Python, C++, PyTorch, OpenCV, CUDA, ONNX, YOLO, Docker.

Hardware: RTX 4060 Ti, Hailo-8L NPU, Mali-G52.

3600+ lines of code running in a real school. 20K+ lines in a shipping edge AI product.
