
RedditMarch 25, 2026ai
Run Qwen3.5-4B on AMD NPU
Tested on Ryzen AI 7 350 (XDNA2 NPU), 32GB RAM, using Lemonade v10.0.1 and FastFlowLM v0.9.36.
Features
Low-power
Well below 50°C without screen recording
Tool-calling support
Up to 256k tokens (not on this 32GB machine)
VLMEvalKit score: 85.6%
FLM supports all XDNA 2 NPUs.
Some links:
Perf. benchmark: https://fastflowlm.com/docs/benchmarks/qwen3.5_results/
Computer (ASUS) under test: https://www.asus.com/us/laptops/for-home/zenbook/asus-zenbook-14-oled-um3406/
🍋Lemonade server: https://lemonade-server.ai/
FastFlowLM: https://github.com/FastFlowLM/FastFlowLM
Source: Reddit · reddit.com