February quick-takes
Recap of my short posts on LinkedIn in February
AI Slop in Content Writing
Dear bloggers, content writers, commentators and social media managers: I know you like and use AI. For real. It's screamingly obvious. It's so obvious it screams slop – and the blame isn't on AI, it's squarely on you.
Even if the underlying idea or thought is your original, when you apply AI lipstick to it, you sabotage your own reputation. When someone sees an obvious AI post, they immediately discount everything you're trying to convey, even if it does have merit on its own.
Fortunately, this is easy to fix. Here are a few obvious tells that AI shadow-wrote it for you:
- You copy-pasted the “Would you like me to expand on this?” followup ChatGPT gives you without even reading the content. This is not just a red flag, this is cause for excommunication!
- Your post is full of em-dashes. Cm'on, you never even heard of it before, admit it.
- Lose the emoji. It was tone-deaf even before it became a sure sign of LLM authorship.
- The “It's not X. It's Y” contrasts are sometimes needed, but AI dials that to eleven.
- While I'm at it, read your text aloud: if a friend would ask for your health to hear you utter that, reword it.
- In general, if your text sounds like a TED talk, that's a bad sign, even if you 100% wrote it manually.
- Thank God Ghibli memes are out of fashion, but if I had a dime every time I see an image of a laptop with screen on the outside lid, or an unmistakable GPT-Image-style cartoon, I wouldn't need to be on LinkedIn anymore.
- Yes, ChatGPT and Gemini can do infographics. The results are crowded, hard to read, and boring to boot. I'd rather suffer through an emoji-riddled listicle instead.
- If you profess to be an AI expert and offer tips, tricks, workshops or prompt secrets to other, the above applies doubly to you. Low effort means not only you generate slop, you also sell slop.
There are better ways to leverage this wonderful new tech, that don't insult your readers' intelligence.
On the Slow Death of Scaling
On the slow death of scaling is an interesting essay about alternatives to “bigger is better” approaches in modern AI research.
It's a nuanced one and easy to misread: if you're an AI believer it's easy to retort with “scaling just shifted to inference time!”, and if you're a doomer you can point and say “see, exponential cost and environment impact for diminishing returns!” It doesn't say either of these things.
The “scaling laws” or “the bitter lesson”, or “when in doubt, use brute force” refer to the fact that it's often better and easier to solve the problem by applying a bigger hammer (or a graphics card, or a data center).
The essay just states it's not always the case, documents smaller LLMs that obviously outperform larger ones and lists several areas where compound approach improvements end up being better than pure weights/data/compute scaling: better (synthetic) data, chain of thought, distillation, reasoning, tools, RAG, agents...
I read it as an optimistic look into the future where, free from “just increase the size by 10x” arguments, researches can invent even better ways of doing AI.
But don't take my word for it – the essay is an easy read, no hard math, and only 12 pages long (the rest are references). Worth your time if you're into this stuff.
AI Coding, Mediocrity and the Elephant in the Room
Earlier today I had a chat with a friend (also a seasoned senior developer) about the future of coding (in the next year or so) and the implications for software quality.
We're all mostly concerned with whether AI can match human developers in terms of software quality, but the elephant in the room is our assumption that most code today is of good quality. And, to be frank, about the skill level of a median developer.
Median developer is mediocre by definition, and half are even worse than that!
Between the two of us and over many years, my friend and I saw a lot of code in various companies all over the world, from scrappy startups to BigCos, written by many different people.
Large amounts of said code were human-generated slop slapped together by mediocre coders, who weren't really interested in crafting beautiful art, and/or had tight deadlines and uptight bosses who wouldn't let them even if they had an inclination to.
When we talk about AI not being able to match the art and ingenuity of expert developers with lot of time on their hands, we raise the bar several notches higher than we hold it for a large number of human coders.
AI may not top the results of “A players” or “10x devs” or “top performers” (as startup gurus like to call the best software engineers out there), but they probably match “B players”, “1x devs” and “meets expectations” at 1% the price, and are already better than “C players”, “0.1x devs” or “needs improvement” coders.
This is not to insult anyone (or everyone), but to remove our rose-tinted nostalgia glasses looking at humans as somehow all being masters of their craft and bursting with creativity, skill and inspiration all the time.
BTW this applies wider than coding. I laugh sardonically every time I hear or read about “AI slop” on the internet, like we haven't had to endure decades of “human slop”, be it in form of text, video or code.
If AI slop is the death of the internet, we've been hooked to a zombie for a time now.
Cijene API Talk: Scraping, Billions of Prices, and Croatian Law
Last year I gave a few talks about Cijene API, a daily price aggregator API for Croatian retail prices.
I've now uploaded a standalone version of the talk to YouTube.
If you're interested in war stories of data scraping and validation, managing billions of prices on a single server, and peculiar Croatian legislation that motivated that in the first place, check it out.
Karpathy's Rapid Shift to 80% Agent-Driven Coding
Andrej Karpathy on AI-assisted coding:
I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. [...] It hurts the ego a bit
Andrej was recently on record (on Dwarkesh podcast) saying AI coding agents were not good enough for non-boilerplate coding tasks, which makes the new post so surprising (and yet not, to people watching closely what's been happening with the coding agents).
Anecdotally, I've been hearing the same from many senior software developers who know their stuff (ie. aren't just blinded by hype). The development workflows are being structurally redefined very quickly.
In my recent talk on AI-assisted software engineering I tongue-in-cheek joked that the information has a “best before” date of a couple of months. Seems right on spot ... I'll need to update the talk (again).
Security Warning: Don't Run Clawdbot/OpenClaw Without Precautions
Public Service Announcement: Do not run Clawdbot (OpenClaw), unless you really, really, really know what you're doing.
Clawdbot (now renamed to OpenClaw to avoid infringing Antrhopic's Claude trademark) is an AI assistant that you can hook up to your email, whatsapp, telegram, files, and let it, ...well, assist you.
There's a huge hype building up around it, so you might be tempted to try it out (maybe on a spare computer, as a precaution). Don't – not in its current state.
The thing basically runs “YOLO” on all your data and can act without your permission. This is extremely dangerous, as it is trivial to do major damage (intentionally or not) using it.
In the words of Simon Willison:
This project terrifies me. On the one hand it really is very cool, and a lot of people are reporting great results using it. But it's an absolute perfect storm for prompt injection and lethal trifecta attacks. People are hooking this thing up to Telegram and their private notes and their Gmail and letting it loose. I cannot see any way that doesn't end badly.
We are certain to hear about some major security problems from people using this.
If you really know what you're doing, and properly manage access to your resources, sure, you can check it out. But most will just hook it up to everything and basically play russian roulette with their data. Even if run on a separate computer, if you give it access to your mail, messaging, and cloud files, you can still be royally screwed.
AI assistants (ChatGPT, Claude, or home-grown rigs) have been able to do this for months (or even years) now, but until now we've collectively been mostly careful enough about giving them access permissions. There have been whole startups, such as Arcade.dev built around this.
Clawd basically throws all that caution away in a “look, no hands!” move.
New Open Models: Kimi K2.5, Qwen3-Max-Thinking, Trinity Large, Z-Image
With Kimi K2.5, Qwen3-Max-Thinking Trinity Large and Z-Image, this has been an interesting week for open AI models:
Kimi K2.5 is an upgrade from K2 by pretraining it with additional ~15T visual and text tokens. It reportedly improves coding and vision capabilities and supports “agent swarm” operation (many agents collaborating on a task).
The Kimi team also released a coding app (ala Claude Code / Codex) and a mobile app.
Qwen3-Max-Thinking is sadly not an open model, but it's still a notable update (and I hope there's going to be a small, open, distilled version in the near future). Also an improvement on the existing Quen3-Max by scaling up the number of parameters and additional RL.
Z-Image is an image generation model from Alibaba. A month ago they released a small ~6B version (Z-Image-Turbo) and now they have an update for the main (large) model.
Not to be outdone by the Chinese labs, the US lab Arcee AI released Trinity Large, a new open model with 400B parameters (13B active – 4 out of 256 experts in use). An annoucement blog post contains many technical details.
The SpaceX-xAI Merger: Untangling the Financial Logic
Startups, tech, financial, AI, and buble WTF of the week, all rolled into one: SpaceX in talks to merge with xAI.
Okay, let's unparse this:
SpaceX is in the business of building and launching rockets, and providing satellite internet. It's profitable, held in high regard, dominates over other launch providers, Starlink is a major success, and the company is on its way to IPO (and Mars, albeit that's still in the far future).
xAI is the controversial company behind Grok and tied to X (ex Twitter).
Both are private companies, controlled by their majority owner Elon Musk. Besides that, they have nothing else in common.
The rumored intention is to offload the huge debt that Musk initially took on for buying Twitter, first to xAI (diluting it in the process through other investors in xAI), then to SpaceX, which could finally wash it clean through its IPO, without Musk having to sell a bunch of his Tesla shares and crashing their price.
BTW: since both are controlled by the same person, what's there to be “in talks” for?
GPT-5.3-Codex and Claude Opus 4.6: Incremental but Notable Updates
OpenAI GPT-5.3-Codex and Anthropic Claude Opus 4.6 are here.
Codex is coding-optimized version of GPT-5.3: the announcement post showcases a (rudimentary) 3D car chase game built completely by Codex.
Opus 4.6 is an incremental update to the general purpose Opus model: major improvements is context window size (1M tokens, up from 256K) and tweakable effort (low,normal,high,max). Both the large context window and max effort are currently only available via the API (not in Claude Code).
My initial impressions are both are incremental updates over existing models. We'll see if there are any noticable improvements in long coding sessions (especially with 1M context) in the following weeks.
Since all new models can knock out a fairly good Minesweeper clone, this year I'm upping the stakes for my coding tests: the task is to create a minimalistic version of a Real Time Strategy (RTS) game – think WarCraft, StartCraft or C&C.
No combat, enemy AI (heh...) or scenario objectives yet ... but it's a good start!

Left: Codex 5.3; Right: Opus 4.6
A Guide to Agentic Programming for Skeptics
This post details how Mitchell (creator of Vagrant, Terraform, and Ghostty) went from not really being impressed with AI coding performance, to using it constantly as a no-brainer.
If you're a software engineer and have doubts of usefulness of using AI in your coding workflow, the post doubles as a good, zero-hype guide how to try it out and what to expect.
tl;dr: it takes effort and willingless to “waste” time until you get proficient and find a sweet spot.
Letting Claude Read My Email — and Trying to Prompt-Inject Myself

I won't be installing OpenClaw any time soon, but I did let Claude read my email, just to see what would happen:
While everyone is focusing on OpenClaw, a viral “yolo” bot, the underlying magic is largely due to a bunch of useful tools allowing you to connect to your data and communication channels. Many of these are implemented as simple command-line utilities (since modern LLMs can use them really effectively, better than MCPs).
One of these is https://gogcli.sh/ (open source, written by the OpenClaw author), a CLI client for google apps (gmail, docs, calendar, ...). This is useful by itself, as can be used in various scripts or other custom automation without messing around with OAuth.
Once I got this set up, I wanted to see how (regular) Claude Code could use it, and indeed it works pretty well in that setup. Of course, it's not very useful if I have to tell Claude to check for email or send a message (a few clicks in the browser would do that faster), but opens more room for careful tinkering, without going full-in with OpenClaw.
The next thing I tried was to prompt-inject myself! I tried to get Claude to interpret the instructions in the email instead of just summarizing it to me. It didn't work! At least for this most basic prompt injection attack, Claude was clever enough to spot and ignore it. You can see the result in the screenshot below.
This should not be taken as proof that LLMs are immune to prompt injection attacks: if I tried a bit harder I might have constructed one that worked. But they're not as trivially susceptible as one may believe.
Fun times!
AI-Assisted Coding Talk Now on YouTube
A few months ago I held a talk on AI-assisted coding at a few venues. In one slide I tongue-in-cheek added a “best-before” date: January 2026. Turns out, that date was pretty spot-on.
The bleeding edge has shifted so much in the past three months that half of the talk might already be obsolete. What's not obsolete is the quest for quality, accountability and open-minded exploration of new tools.
I've just posted a recorded version to YouTube.
I've been talking to a lot of folks lately on how they're adapting the newest AI capabilities in their software development teams and will probably have a major update in a few weeks. If you're interested to hear me talk about it at your meetup or company, let me know.
A $46K Vercel Bill That Could Have Been $100 on Hetzner

This could've been a $100 Hetzner bill:
This is static data. No writes. Trivially cacheable.
AI Coding and the Risk of Overwork
“The AI vampire” by Steve Yegge chimes with my own feelings: it's very easy to get overworked using AI.
Steve's got a flamboyant writing style and his Gas Town approach to future of software engineering is controversial to say the least, but I think there's something to the problems he's describing.
Namely: AI coding pulls you in, you feel like you've got superpowers and could (vibe)code for hours on end. The feeling's been well documented by others like Armin Ronacher (whom I referenced a few times before) and Peter Steinberger (creator of Clawdbot/OpenClaw).
“Just one more prompt...” is an irresistible siren's call, but burning candles on both ends isn't sustainable and will surely result in burnout. This isn't healthy for an individual, and isn't productive for their employer.
I share Steve's fear that productivity expectations will rise throughout the industry, increasing pressure from the top. We're already flooded with 996 and war stories founders and employees pushing themselves far beyond the limits. Crunch death marches used to be a staple of game dev studios, now they're almost a badge of honor in startup circles. AI-assisted coding just adds the fuel to this fire.
In my own experience, as I get older I find that after 6 hours of focused work my mind turns to mush and I'm a zombie for the rest of the day (except I drawl “chocolateee” instead of “brainzzz”). Sure I can “sprint” more – for a couple of days, or weeks. Not for months, not permanently, not without serious metal, emotional, and physical consequences.
With AI doing many of the routine tasks and driving just one agent, I can now breeze past my limit with no sweat – it's better than caffeine! But we're encouraged, motivated and (soon?) expected to multitask. What happens when managing 10 agents in parallel becomes just “meets expectations”?
All of which is to say – I agree with Steve that maybe we should take this productivity improvement opportunity to reflect how we're spending our time, remove our foot from the gas pedal, and – at the risk of sounding too European – dial back a bit.
Analyzing curl|bash Installers with LLMs
I'm dismayed by the normalization of curl | bash pattern for software installers. This downloads and executes an installer from the internet and entails a bunch of risks:
- trust that the script author is not malicious
- trust that the installer hasn't been hacked
- hope you didn't misspell anything and executed a script from a phishing site
- trust that there's no man-in-the-middle attacks
- trust that the author's script will work well with your system and won't screw up anything
This gives a random piece of software from the internet the same level of trust you give to verified packages from your OS provider (eg. Debian packages), with no sandboxing (like Flatpak, Snap, or others).
In most cases, it's fine, you're installing software you trust anyways – but this holds true to most insecure practices! Telnet, HTTP, unencrypted passwords are mostly fine – until they aren't.
What's a dev to do? In lieu of pouring through hundreds of lines of creative Bash scripting myself, I created curl-bash-explain.dev, a handy tool for analyzing the script using LLMs.
For example, here's a step-by-step breakdown of what Claude Code installer is doing.
Anthropic's $30B Funding Round and the AI Revenue Race

Anthropic has just closed new funding round ($30B investment on $380B valuation). What's more interesting is revenue growth, over 10x in the each of past three years, brigning them to ~$14B ARR today.
In comparison, OpenAI recently announced they've passed $20B ARR mark in 2025 (I would guest, near the end of the year). A year earlier Anthropic did $1B ARR, and OpenAI $6B ARR.
A few thoughts:
- Anthropic hasn't caught up to OpenAI completely yet, but has shown to be capable competitor – it's not a one-horse race any more
- ChatGPT seems to have more mindshare among non-techies, while Claude rules for devs: although that might change with Codex (from OpenAI) and Claude Cowork (Anthropic) becoming incresingly capable
- OpenAI rolls out ads; Anthropic takes shots at them, so I'd expect no ads on Claude for the time being
- These are revenue numbers (and forward looking ARRs, not even TTMs, to boot); nobody talks expenses
Are we in a bubble? Will it burst? When? Dunno.
All I know is, of those ARR figures, $1200 to Anthropic and $240 to OpenAI is from my pocket, and I don't expect that to decrease.
CroAI Code Club Meetup Recap

Had a great time yesterday at the inaugural CroAI (Croatian AI Association) Code Club meetup!
Matija Stepanić and me ostensibly presented live vibe coding, but we also (intentionally) opened the floor for discussion from the get go, and heard many points of view, experiences, tips and tricks from the participants. We've learned as much from the audience as they from us!
Some takeaways:
- use Project feature from Claude (or ChatGPT) for planning the project before you switch to coding
- Matija intentionally lowered his Claude plan to check usage limits – tokens burn up really fast! this motivated a discussion about performance, quality and price tradeoffs for different models
- I demoed Claude Code use to explain and work on a codebase I know nothing about (project is in Flutter and uses Firebase and I'm not familiar with either)
- there are no best practices yet – we've heard so many different use cases! we're all collectively trying to figure this new agentic engineering thing
Big thanks for Valentina Zadrija for organizing everything and inviting us, Herman Zvonimir Došilović for kickstarting the whole thing, Matija for leading the charge yesterday, Wana Kiiru and the CroAI crew for the logistics, and everyone who came and participated.
Overall, great time, learned a lot, the feedback was very positive and multiple people stepped up to propose a talk at a future Code Club – already looking forward to the second one!
“You're Prompting It Wrong”: A Story, a Challenge, and an Offer
“You're prompting it wrong”: a story, a challenge, and an offer.
Recently I talked with a friend about the difficulties of automatically evaluating AI agents and he shared an example task for the agent: “go to website xyz, fetch the newest articles, and send the summaries of the top 5 most interesting ones to my mail”.
I pointed out a few problems with the prompt:
- this is a straightforward deterministic procedure and LLMs are non-deterministic – it's better to ask AI to write a script to do this, than hope it will always properly adhere to the steps
- “most interesting” is a judgement call, LLMs are notoriously bad at this, you'll get a biased random result instead – ask it to summarize every article and include in the email, and you will quickly determine what's interesting to you
This anecdote reminded me that the intuition of how and what to ask AI is something you have to practice to get. It looks easy but if you don't have the intuition, you'll get random quality results.
If you're bad at it due to lack of practice, it's easy to dismiss it as “AI is stupid” or “am I stupid?”. Neither. You just need practice and some guidance.
This brings me to my challenge and offer: if you're trying to get your AI to do something and it just seems dumb, yet feels like the modern ones should be able to do it, send me a DM and I'll debug it with you.
Let's see if we can improve your AI intuition!
(offer valid for problems you can explain in a couple of messages; for more complex matters I'm available for consultation:)