Local AI Models Are Surprisingly Good at Code Generation
I had an unexpected discovery recently that I wanted to share with you.
I needed a small script to help anonymize some test data. Nothing fancy, but on my way to ChatGPT I ended up in the "wrong window" and threw the prompt at Ollama's gpt-oss instead.
It surprised me by coming back with a solid result, quickly.
Small local models are the future, and being able to run them on hardware you already own is a political statement.
No remote API calls. No burning through tokens or the environment. No company watching. Like a cowboy, just me and my machine.
The Experiment
I ended up testing the same prompt across several different local models and grading the results:
"I need a script that will give me at least 1042 distinct but made up show names. They should be funny and grammatically correct and written in TypeScript"
I expected gpt-oss:20b to be the best of the lot, but surprisingly the 5-month-old llama3.2 crushed everything on the time dimension.
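If you want to run a comparison like this yourself, the timing part is easy to automate against Ollama's local HTTP API; the quality grading I did by hand. The sketch below is my own guess at how you might wire up the loop (the model list and timing approach are assumptions, not the actual harness from the repo):

```typescript
// Rough sketch: send one prompt to several locally pulled Ollama models and time each run.
// Assumes the Ollama server is running on its default port (11434) and Node 18+ for global fetch.
const prompt =
  "I need a script that will give me at least 1042 distinct but made up show names. " +
  "They should be funny and grammatically correct and written in TypeScript";

// Only the two models named in this post; add whatever tags you have pulled locally.
const models = ["gpt-oss:20b", "llama3.2"];

async function timeModel(model: string): Promise<void> {
  const start = Date.now();
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  const data = (await res.json()) as { response: string };
  const seconds = ((Date.now() - start) / 1000).toFixed(1);
  console.log(`${model}: ${seconds}s, ${data.response.length} characters of output`);
}

async function main(): Promise<void> {
  for (const model of models) {
    await timeModel(model);
  }
}

main().catch(console.error);
```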
Key Findings
- 4 out of 7 models produced working results on the first (and only) try
- gpt-oss:20b delivered the highest quality code (4/5 rating)
- llama3.2 was the fastest by a significant margin
- 3 models produced code that failed to run (mostly syntax errors)
The most surprising takeaway? These small, local models are good enough to be useful and powerful enough to matter for everyday coding tasks.
Why This Matters
We're entering an era where you don't need to send every coding question to a remote API. Local models give you:
- Privacy: No company watching your prompts
- Speed: No network latency
- Cost: No token burning
- Independence: Your hardware, your rules
The test code and full analysis are available on GitHub, and the complete post with all the detailed results and code quality breakdowns is on my site.
Read the full post with detailed results and code examples →
What local models have you been experimenting with? I'd love to hear about your experiences.