Run Qwen3.5-4B on AMD NPU
RedditMarch 25, 2026ai

Run Qwen3.5-4B on AMD NPU

Tested on Ryzen AI 7 350 (XDNA2 NPU), 32GB RAM, using Lemonade v10.0.1 and FastFlowLM v0.9.36.

Features

Low-power

Well below 50°C without screen recording

Tool-calling support

Up to 256k tokens (not on this 32GB machine)

VLMEvalKit score: 85.6%

FLM supports all XDNA 2 NPUs.

Some links:

Perf. benchmark: https://fastflowlm.com/docs/benchmarks/qwen3.5_results/

Computer (ASUS) under test: https://www.asus.com/us/laptops/for-home/zenbook/asus-zenbook-14-oled-um3406/

🍋Lemonade server: https://lemonade-server.ai/

FastFlowLM: https://github.com/FastFlowLM/FastFlowLM

Source: Reddit · reddit.com