Everyone seems to be building their own MCP server, but there didn't seem to be any Dart/Flutter MCP clients yet. So, as a learning experience, and in the hope of eventually making one customised to my own needs, I recently started working on one and wrote up my experience so far: Diving Deep: From Zero to a MCP *Client* in Flutter | by Maksim Lin | Jul, 2025 | Medium
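For anyone who just wants the gist before reading the article: at its core an MCP client speaks newline-delimited JSON-RPC to the server over stdio. A stripped-down sketch of that handshake (not the code from the article; the server command and client name here are just placeholders) looks roughly like this:

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:io';

Future<void> main() async {
  // Spawn the MCP server as a child process (command is a placeholder).
  final server = await Process.start('my_mcp_server', const []);
  final lines = StreamIterator(
    server.stdout.transform(utf8.decoder).transform(const LineSplitter()),
  );

  void send(Map<String, Object?> msg) => server.stdin.writeln(jsonEncode(msg));
  Future<Object?> read() async {
    await lines.moveNext();
    return jsonDecode(lines.current);
  }

  // 1. initialize: the client announces its protocol version and identity.
  send({
    'jsonrpc': '2.0',
    'id': 1,
    'method': 'initialize',
    'params': {
      'protocolVersion': '2024-11-05',
      'capabilities': {},
      'clientInfo': {'name': 'BlueDjinn', 'version': '0.0.1'},
    },
  });
  print('initialize -> ${await read()}');

  // 2. Acknowledge the handshake, then ask the server what tools it exposes.
  send({'jsonrpc': '2.0', 'method': 'notifications/initialized'});
  send({'jsonrpc': '2.0', 'id': 2, 'method': 'tools/list'});
  print('tools/list -> ${await read()}');

  server.kill();
}
```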
Looking forward to the next installment! My little bespoke MCP setup stands and falls upon semantic (vector) search and I know I have much to improve.
Also, I’m interested in using a local LLM but not if it’d mean a significant downgrade in capabilities (from Claude). Do you have any recommendations? Which model are you using and for what?
I “vibe coded” a stopgap solution using GloVe embeddings. It needs a lot of work, and I’m sure I’d get better results from a dedicated vector database, but the upside is that it only depends on the GloVe data (a text file) and I have 100% control.
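Roughly speaking (this is a simplified sketch of the general idea, not the actual code), it boils down to: parse the GloVe text file into a word-to-vector map, embed each chunk as the average of its words’ vectors, and rank chunks by cosine similarity against the query:

```dart
import 'dart:io';
import 'dart:math';

/// Parse a GloVe text file ("word v1 v2 ... vN" per line) into a map.
Map<String, List<double>> loadGlove(String path) {
  final vectors = <String, List<double>>{};
  for (final line in File(path).readAsLinesSync()) {
    final parts = line.split(' ');
    vectors[parts.first] = parts.skip(1).map(double.parse).toList();
  }
  return vectors;
}

/// Embed a text chunk as the average of its words' GloVe vectors.
List<double> embed(String text, Map<String, List<double>> glove, int dims) {
  final sum = List<double>.filled(dims, 0.0);
  var count = 0;
  for (final word in text.toLowerCase().split(RegExp(r'\W+'))) {
    final v = glove[word];
    if (v == null) continue;
    for (var i = 0; i < dims; i++) {
      sum[i] += v[i];
    }
    count++;
  }
  return count == 0 ? sum : sum.map((x) => x / count).toList();
}

/// Cosine similarity, used to rank chunks against the query embedding.
double cosine(List<double> a, List<double> b) {
  var dot = 0.0, na = 0.0, nb = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (sqrt(na) * sqrt(nb) + 1e-10);
}
```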
@csells I am familiar with Dart’s MCP server; in fact, I mentioned that it was @filip’s post about it here that started me off down this road. But as I mentioned, BlueDjinn is a client, not a server, and AFAIK it’s the only one built in Flutter.
@filip thanks for the pointer to GloVe! I’ll have to check it out, as my (maybe naive) plan was just to use SQLite with its sqlite-vec extension to store the RAG embeddings, and to generate them with Ollama’s all-minilm, which I’ve seen recommended for generating embeddings.
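The rough shape I have in mind (an untested sketch; the endpoint and response shape are what I understand Ollama’s embeddings API to be, and the SQL follows the sqlite-vec README, so treat it as a starting point rather than working code) is something like:

```dart
import 'dart:convert';
import 'package:http/http.dart' as http;

/// Ask a local Ollama instance for an all-minilm embedding (384 dims).
Future<List<double>> embed(String text) async {
  final response = await http.post(
    Uri.parse('http://localhost:11434/api/embeddings'),
    headers: {'Content-Type': 'application/json'},
    body: jsonEncode({'model': 'all-minilm', 'prompt': text}),
  );
  final embedding = jsonDecode(response.body)['embedding'] as List;
  return embedding.cast<num>().map((n) => n.toDouble()).toList();
}

// The sqlite-vec side would then just be SQL, along the lines of:
//   CREATE VIRTUAL TABLE chunks USING vec0(embedding float[384]);
//   INSERT INTO chunks(rowid, embedding) VALUES (?, ?);  -- vector as JSON text
//   SELECT rowid, distance FROM chunks
//     WHERE embedding MATCH ? ORDER BY distance LIMIT 5;
```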
In terms of performance, there’s definitely a noticeable gap between the 20-30B 4-bit quantised models I can run locally and frontier models like Gemini 2.5 Pro or Claude.
But then again, I’m only dipping my toes in these waters, so I only got an M2 Mac Studio with 32GB that happened to be on clearance pricing at a local shop, which limits me to allocating around 30GB to the GPU. I expect I’d get much better performance if I stumped up for either a Studio Ultra or one of the new AMD 295+ machines with 128GB, which can run all but the very largest open-weights models, albeit still in quantised form.
If anyone does have access to a machine with enough unified RAM to run those larger models, I’d love to know how well they do.
But in practical terms: even running a tiny 1.5B qwen-coder is good enough for me for auto-complete inside VS Code. As I said above, the analysis and code generation of models like qwen-coder3-32B or gpt-oss-20B is impressive but not as good as the frontier models. Then again, I have unlimited “free” access to my local LLMs and pay a pittance in electricity, only ~45W for the M2’s GPU! Whereas even if I had invested in a 4090, with only 24GB of VRAM, I’ve read you need a 600W power supply! So at least the greenie in me feels better that I’m not wrecking the environment when I use my local LLMs.
Oh hey @filip! I just had a quick look at bespoke, and from an initial glance it looks like a more general-purpose version of what I had in mind for BlueDjinn, which I was only thinking of in terms of software-dev work. I shall have to up my ambitions with it, as I really like the idea of adding a basic MCP server onto my own “knowledge db”, though mine is just a big directory of plain-text and markdown files rather than Obsidian.
Thanks heaps for publishing your code; being able to look at your (or your LLM’s) code for GloVe as a starting point will definitely make it easier to try out.
And thanks for the pointer to Dartantic! I hadn’t seen it before. It looks like it may already have some features that I need, so it will save me reinventing some parts of that wheel.
Guys, a quick question related to this:
I’m trying to develop a “Figma to Flutter MCP Server”, so if I’m able to have a Dart package for it, would that count as a “Figma to Flutter MCP Client”? And should I do it, or would it just be unnecessary effort?
Oh okay, but how do you handle the assets part, e.g. if a design has an image or something? And the colors/text part? (I’m feeling I’ve wasted my week developing this if your way of doing it works.)
This may be a stupid question (and it’s definitely OT) but how do you use a local LLM for auto-complete? Is this a VS Code feature? (I’m on IntelliJ so I’m not familiar but maybe I could get a similar setup going.)
And does your local qwen-coder LLM work okay with Dart/Flutter codebases?
Not at all, there are several VS Code plugins that provide this feature. The one I’m currently using is Twinny, which is nice in that it lets you set different models for auto-complete vs chat/agentic tooling, so I use a small 1.5B-param qwen-coder model for auto-complete and then play around with different, bigger 20-30B models for chat/agentic tooling. I’ve also used another one called Cline, but found it a bit too focused on using online models.
And unfortunately I haven’t had a chance to properly try out local LLMs for Dart/Flutter, or even the online frontier models, as I’ve lately been doing mostly embedded C++, and neither the online nor the local models are particularly good at that, so maybe there would be a bigger performance difference on Dart codebases. I’m hoping to try that out soon though, as I need to work on more features for the Flutter companion app for the embedded device (you know the one I’m talking about, @filip), and I’ll report back here on how well it works.