
Lately I’ve been thinking about cloud platforms, SaaS, AI, scraping, costs, and the gradual closing of the web.

At first glance these seem like separate topics, but they share a set of assumptions that have quietly become “best practices” in the industry. Assumptions about what’s professional, what’s safe, what’s scalable, and what’s supposedly too hard for individuals or small teams to do themselves.

While I'm not anti-cloud or anti-SaaS in general, I do have a feeling these tools are often used far beyond where they make sense, largely due to marketing pressure and fear. That overuse creates real downstream effects: higher costs, lock-in, fragile systems, and eventually people closing off their own sites and apps just to stay afloat.

Cloud platforms and the myth of “serious infrastructure”

There’s a widespread belief that building something “serious” online requires “serious” platforms, which usually means public cloud infrastructure: AWS, GCP, Azure. Anything else is treated as amateurish.

The argument usually goes like this:

  • Cloud gives you scalability, reliability, and failover.
  • You shouldn’t self-host or manage servers yourself.
  • Professionals use managed platforms.
  • You'll need to hire sysadmins anyway, so it won't be cheaper.

This narrative sounds reasonable and may be true in some cases, but it's off the mark in many real-world situations.

Cloud platforms are not set-and-forget. Cloud solutions are complex beasts that are easy to misconfigure if you don't know what you're doing. As a result, you still need expertise to design, audit, and maintain the system, and over time you’ll need changes, fixes, and migrations. Instead of Unix tools and config files, you use web consoles, access policies, managed services, logs, metrics, and alerts. That just shifts where the complexity lives, but the work and costs remain.

Cloud infrastructure also isn’t inherently more reliable. In recent months alone, AWS, Cloudflare, GitHub, and others have had significant outages. Shared platforms fail too, and when they do, the failures are global. A small, well-understood system under your control is easier to reason about and faster to recover.

Security follows a similar pattern. Cloud providers have professional teams, but they also represent high-value targets with enormous blast radii. A small, simple, well-patched server with a minimal setup is often simpler to secure and audit.

Cost is where the differences become unavoidable: following cloud “best practices” gets expensive fast. Compute is only the starting point, and then load balancers, replication, managed databases, traffic, logs, metrics, alerting, and add-ons all pile on. In practice, cloud setups are often an order of magnitude or more expensive, and they still require specialized knowledge.

The usual justification is that this avoids hiring a sysadmin. What actually happens is that you replace that role with an AWS, GCP, or Azure consultant at the same cost. If you can learn cloud tooling deeply enough to manage it yourself, you can learn Linux administration.

This pattern isn’t new. In the 2000s, Linux and Apache were considered unprofessional compared to Windows servers or branded Unix systems. Postgres and MySQL were dismissed in favor of Oracle. Linux routers were seen as inferior to Cisco hardware. These were marketing narratives framed as best practices, not technical inevitabilities.

Cloud infrastructure follows the same trajectory.

The Anti-Not-Invented-Here syndrome

A second pattern shows up in how people approach application architecture.

Instead of building small, simple components, there’s an increasing tendency to default to SaaS platforms and heavyweight frameworks for problems that are already well understood and mostly straightforward.

A blog is a good example.

Someone wants to write a blog. They reach for React and Next.js. That leads to client-side rendering, which causes SEO issues, so server-side rendering gets added. A remotely exploitable vulnerability appears in Next.js, raising concerns about arbitrary code execution. Running it on a plain Linux host now feels risky. Containers sound safer, or better yet, deploying to Vercel so someone else handles it.

At first it’s free. Then traffic grows. Bots and AI scrapers arrive. Bills start increasing. Now the problem is framed as “evil scrapers.”

None of this was necessary.

A blog is static content. Static HTML generation has existed for decades. Hosting it on a cheap VPS or static hosting costs almost nothing. The attack surface is minimal, and traffic volume doesn’t matter.
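To make this concrete, here's a minimal sketch of a static blog generator. It assumes the third-party markdown package; the folder names and HTML template are made up for illustration, not a recommendation:

    # Minimal static blog generator: turn posts/*.md into public/*.html.
    # Assumes the third-party "markdown" package (pip install markdown);
    # folder names and the template are illustrative only.
    from pathlib import Path
    import markdown

    TEMPLATE = "<!DOCTYPE html><html><head><title>{title}</title></head><body>{body}</body></html>"

    def build(src="posts", dst="public"):
        out = Path(dst)
        out.mkdir(exist_ok=True)
        for post in Path(src).glob("*.md"):
            body = markdown.markdown(post.read_text())
            html = TEMPLATE.format(title=post.stem, body=body)
            (out / f"{post.stem}.html").write_text(html)

    if __name__ == "__main__":
        build()

Serve the resulting folder with any web server or static host and you're done: there's nothing to patch, autoscale, or monitor beyond the machine it sits on.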

Unnecessary complexity creates cascading requirements: more infrastructure, more security layers, more tooling. Eventually outsourcing does make sense, but only because the system was made hard to operate in the first place.

The same thing happens with databases. Instead of running Postgres, MySQL, or SQLite locally, people jump straight to platforms like Supabase or RDS. They’re convenient and feature-rich, but most projects use only a small subset. If you later rely on the advanced features, moving away becomes painful or impossible. Growth then turns into a recurring cost problem.

Authentication is another example. Auth is a solved problem with solid libraries and established patterns. The (solid) advice to avoid rolling your own cryptography has expanded into a blanket avoidance of auth entirely. Instead of teaching people what not to do, the default is outsourcing to third-party services. The complexity doesn’t disappear, it just becomes vendor-specific.
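To illustrate how small the core of that "solved problem" is, here's a minimal password-hashing sketch using only the Python standard library. The parameters are illustrative; in a real project you'd typically lean on your framework's built-in auth or a maintained library rather than writing even this much yourself:

    # Password hashing sketch using only the standard library (PBKDF2).
    # Iteration count and salt size are illustrative; real projects should
    # prefer their framework's auth or a maintained library.
    import hashlib
    import hmac
    import secrets

    ITERATIONS = 600_000

    def hash_password(password: str) -> str:
        salt = secrets.token_bytes(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
        return f"{salt.hex()}${digest.hex()}"

    def verify_password(password: str, stored: str) -> bool:
        salt_hex, digest_hex = stored.split("$")
        candidate = hashlib.pbkdf2_hmac(
            "sha256", password.encode(), bytes.fromhex(salt_hex), ITERATIONS
        )
        return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))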

This shifts effort away from general, transferable skills and toward vendor lock-in.

Over time, SaaS subscriptions accumulate. Individually they seem minor, but together they add up, especially if you have unexpected usage spikes. Running your own product becomes expensive, which forces aggressive monetization, artificial limits, or scaling simply to justify the cost structure.

Scrapers and anti-scrapers

When infrastructure is expensive and usage-based, every request matters. Even serving a blog becomes a cost center. That’s where scrapers of any kind (but especially AI) become a visible problem.

Much of today’s hostility toward scraping is driven by real bills. If serving traffic costs money, unwanted traffic becomes something to block. Cloudflare rules, captchas, garbage responses: anything that reduces load.

With static files served at near-zero cost, this wouldn’t matter. Scraping would be irrelevant or even welcome. Scraping becomes a problem because of the underlying cost model.

The response to those costs is a gradual closing of the web.

Large platforms already operate this way. Facebook, chat platforms, and social networks allow data in while tightly controlling access out. Public content is often only accessible through proprietary interfaces, with limited or hostile APIs.

Individuals increasingly mirror this behavior. Access is blocked for everyone except Google or Bing for SEO reasons. Others are explicitly denied, not merely discouraged through robots.txt.

This dynamic strengthens incumbents. Building a competitor to Google today is limited by permissions, not technology. The web is increasingly crawlable only by the largest players.

The irony is that the actors people fear most (OpenAI, Facebook, Google, Anthropic) can easily bypass these barriers. They have the resources to do so. Smaller companies, researchers, and hobbyists do not. The web closes unevenly.

This creates a feedback loop: marketing-driven practices raise costs, higher costs incentivize restriction, and restriction concentrates power further.

DIY is an advantage

This trend is unlikely to reverse on its own. The incentives are strong, and the marketing is effective.

Understanding that you can operate systems yourself still matters. Hosting simple services and keeping systems boring and cheap are often the most robust choices available.

There's value in recognizing when you only need a small slice of what’s being sold. With AI-assisted coding, implementing that slice is easier than it’s ever been.

People who are comfortable one layer below the current fashion retain more options. They can decide when outsourcing makes sense and when it doesn’t.

Turns out Doing Things Yourself is a real competitive advantage, and it’s becoming rare. I hope it doesn't become extinct.

Increasingly I notice when I talk about AI with others, we often mean subtly different things.

When I'm talking about AI, I'm talking about the technology (LLMs, agentic systems, etc.) and the type of product one can build using that technology (chatbots, assistants, classifiers, process automation, etc.)

I now see that for many others, that's not the primary concern. Instead they think about a class of (consumer) products (ChatGPT, WhatsApp AI chatbot, MS Copilot) and/or the effects of people (mis?)using the technology (AI slop).

To illustrate: In a recent conversation I stated I believed AI models will improve in the future. In my mind, an LLM improving is an objective fact: I can task it with something and get a better result.

The reply was “in which direction, and for whom?” which is a (totally fair) product and business strategy issue – not a technical one. We indeed might have much better LLMs within much shittier chatbots!

Another conversation from a few weeks ago was about AI serving porn. LLMs are gigantic autocomplete machines: they'll serve whatever you want them to serve (to a first approximation). From a technical standpoint, that's a complete non-issue.

Looking at AI as a class of products increasingly used by everyone, including our children and people who are not tech (or AI) savvy – what, how, and why the product makers choose to serve IS very important!

This is similar to “social networks”. What we today call “social networks” are everything but – a far cry from the “web 2.0” social networks era, when the point was to interact with your (real) social circle. The name stuck, the damage is still being assessed (witness the recent Australian ban on social networks for under-16s), but the “social network” aspect is not the problem: the hyper-optimized engagement machine peddling all sorts of questionable stuff is.

Sadly, I think the meme battle is already lost: AI is increasingly being understood as a class of products. The problem is when this mixup causes people to blame the technology for the questionable business and product strategies that big tech companies use to maximize shareholder value.

AI, the technology, is now blamed for layoffs, struggling artists, and slop, to name a few. In fact, companies have a great scapegoat for layoffs, Disney just made $1B from AI (none of which will go to struggling artists), and people have been hand-crafting slop in the name of SEO for years.

I don't have an answer. This post is mainly for my friends, to explain that when I'm bullish about AI, I'm bullish about the underlying technology. I'm not bullish about the way big tech is going to (ab)use it to maximize profit.

Hacker ethic is about using tech in inventive ways to improve people's lives. I believe AI has the potential to do so. Sadly, I know it will also be used by those of “Greed is good” ethic.

I hope when we direct our critique, we aim at the right target.

When you submit a pull-request, you accept full responsibility for the code you're submitting.

This comes up so often in conversation I am amazed that I need to spell this out at all – but here it is.

It doesn't matter if you vibe-coded, used AI-autocomplete, copy-pasted from Stack Overflow or from some other project, or if you asked your aunt to help you. By hitting that “Create PR” (or equivalent) button, you attest that you fully understand what the code is doing and that you have legal rights to submit it (ie. you're not stealing).

If I'm reviewing a PR, or any production code, and the author has no idea how or why it works, it's a red flag. It's worse if the author hides the fact that they have used AI, Stack Overflow, or subcontracted someone on Upwork. In my book, that's a serious and unacceptable breach of professional conduct.

Note that there are situations where it's perfectly fine to have a bunch of spaghetti slapped together with duct tape: spikes, prototypes, quick throwaway code, or low-impact internal tools. Wanna vibe-code that new app screen as a functional mockup? Knock your socks off!

The required quality of the code, and understanding of the details and effects, is unrelated to the tool used to create said code. I would expect any developer (except perhaps the most junior novices – they first need to learn this) to understand how much care they need to put in.

You can't abdicate your responsibility for the code. “The AI wrote it” carries the same weight as “the dog ate my homework.”

What if Pull Requests weren't linear?

Saša Jurić has a wonderful talk titled “Tell me a Story”, described as a “three-act monodrama that explores the quiet art of narrative structure in collaborative software work”. It's not available online yet, but if you have a chance to see it in-person, you should!

I won't spoil the theme here, but the talk in part touches on the topic of code reviews using git and GitHub (or a similar system, like GitLab): in a nutshell, how do you present and keep a coherent story of your changes for the reviewers.

For example: nice commit history

While I share the ideals, the reality starts to get messy when review feedback commits get added to the PR:

it's downhill from here

Saša does address that somewhat, but I still feel it's messy, cumbersome and tedious to deal with:

  1. You can merge the reviewed branch as-is, leaving the mess in your git history
  2. You can amend (fixup) individual commits as per the feedback, which will produce different commits, and reviewers will need to re-review everything on GitHub
  3. You can amend the commits after the PR is approved, cleaning up the branch before merging. If the feedback and rework is extensive, this can be hard to do and error-prone.
  4. You can squash everything while merging, which won't leave any trace of the discussion (Saša argues strongly against this)

I admit I often do 4 just because it's the easiest, and because I treat a branch as the atomic unit of work (once finished), but it does lead to large commits and often unrelated things creep in.

I had resigned myself to living in this messy reality until I stumbled on the Code Review Can Be Better post by Alex Kladov, which introduced me to interdiff reviews.

Interdiff reviews are a way to have your cake and eat it too:

  1. amend the relevant commits
  2. push everything as a separate series of commits
  3. the reviewer reviews the difference between the original series of commits and the new one
  4. repeat as needed
  5. merge the final reviewed series of commits, discard the intermediate ones

Intuitively this sounds like a good approach: each original commit is fixed where needed, the fixes are all reviewed, and the resulting git history looks nice and is easy to navigate.

This is all natively supported by git with the range-diff command.

Assuming the PR is made off the main branch, new-feature is the original PR branch, and new-feature-v2 is the branch fixed after the PR feedback: git range-diff main new-feature new-feature-v2 shows the changes between those two branches, per-commit.

A big downside of this approach is the complete lack of support from GitHub, GitLab, and other similar services. Given that many developers use one of these for PR reviews, the workflow is hard to adopt.

However, moving code reviews to the local machine (perhaps helped with git worktree to avoid the pain of WIP branch changes) could allow curious teams to experiment and perhaps adopt the workflow.

One of the related challenges is where to hold the comments (discussion). The Code Review Can Be Better post mentions placing the review comments as code comments (the rationale is the comment should live next to the code that's being commented) and links to a very interesting talk on how Jane Street does code review.

I find the idea of PR-comments-in-code-comments intriguing: ideally, it does make sense! I don't think we're anywhere near there with the tooling though (Jane Street built their own).

I do want to explore the branch less traveled, though!

I've recently set up a new laptop, which was an opportunity for me to revisit and revise the default software I usually install on any new workstation I set up (I use the term “workstation” here to mean a laptop or desktop machine that I can comfortably use in my daily work — software development in web/backend, AI, audio/video/streaming and related areas).

Here's my latest setup, roughly in the order of installation:

Debian / Ubuntu

I prefer using a Debian-based distribution, ideally Debian stable if all the hardware is supported. Right now Debian 12 (current stable) is pretty old and doesn't support the latest (Meteor Lake) hardware. Instead of mucking around with testing or unstable (which can be fun, but the fun can strike at inopportune times), I've installed the latest Ubuntu (24.10), which supports almost everything out of the box (Linux users will not be surprised to hear I had to tweak some driver options to get suspend/resume working).

The OS installation is pretty standard, the only non-default option I pick is to encrypt the whole disk. Having a non-encrypted disk on a device that can be easily stolen is a no-go for me. On the desktop workstation, I also set up SSH so I can remotely access it from elsewhere, but don't open any ports on the router.

I don't like Ubuntu's tweaks to the GNOME desktop and Snap packages, so if installing Ubuntu I do remove those. In general, I prefer setting up the apt repositories for 3rd party packages (getting auto-updates and all the other apt goodies). If the app doesn't have a repository but has a .deb package, I'll install that. I can also live with flatpak packages, and as a last resort, I'll manually install the app into /opt/<app> (if it's a GUI app or has many files) or /usr/local/bin (if it's a single binary).

GNOME

GNOME is a pretty opinionated piece of software. The developers' particular set of opinions resonates with me and I've been a very happy user for the past 20 years or so. I prefer the vanilla GNOME interface (ie. without Ubuntu tweaks) and only need minimal customizations (mostly a few key shortcuts).

I do tend to only use the basic system and utilities. Not because I don't want to use various GNOME apps, but because they just don't fit into my preferred workflow much.

1Password

I keep all my passwords in 1p, so it is the first thing that gets installed on a new machine after the OS is installed. It's very easy to set up – install via their apt repo, scan the QR code on my mobile 1Password app, and it's all there.

I also keep my SSH private keys in 1Password and set it up as the ssh agent. This way, I only need to unlock 1Password to unlock my SSH keys.

Dropbox

The next app is Dropbox. I'm a paid user and keep everything important (documents, company and personal documents, some media files, etc.) there. I also use Dropbox to auto-upload my mobile photos and videos, and symlink Pictures/, Videos/, and Music/ to the respective Dropbox folders. Another useful option I use is to scan documents, bills, etc. with my mobile and have them auto-uploaded to Dropbox. Though the quality is not the same as with a proper scanner, it's good enough for most purposes.

I don't keep my code, or the dotfiles/settings in Dropbox.

Firefox / Firefox Dev Edition

I use Firefox as my main browser, and a separate installation of Firefox Developer Edition for development work. Although Firefox is already available on both Ubuntu (via Snap) and Debian (Firefox ESR), I remove those and install the latest version directly from Mozilla's apt repositories.

I love the Multi-Account Containers feature/extension in Firefox. I also install the uBlock Origin (ad/tracking blocker), 1Password (1p integration) and Kagi (search engine) extensions. I use (and pay for) Kagi as my search engine, and I'm very happy with it.

Many of the services I use daily are web-based (Fastmail for private mail, Google apps for work, GitHub, and so on).

Tailscale

I have a personal VPN provisioned with Tailscale. Setup involves installing and enabling the Tailscale client and logging in with my account. Once enabled, I can connect from my laptop to my desktop from anywhere without punching holes in my router or worrying about security.

Visual Studio Code

I use VSCode as my main editor, mostly for editing Python, JavaScript and Markdown files. I use the official Microsoft binary and immediately turn off all telemetry (hopefully all!).

I heavily use the Remote SSH feature of VSCode: most of my projects are located on my desktop. When on the laptop, I open them via Remote SSH, and since that goes through the Tailscale VPN, I can do this anywhere in the world. The SSH latency (for terminal work) can be a bit high when accessing from another continent, though.

One thing I was worried about when first setting this up was potential conflicts if the same project is opened locally (on the desktop) and remotely (from the laptop) at the same time, but I haven't had any issues with it.

VSCode has pretty good support for Python (including my linter/formatter of choice, ruff) and JavaScript. I also use GitHub Copilot, mostly as a smart auto-completion tool.

Obsidian

I love Obsidian, but keep it simple. I use it without any extra plugins or customizations – just a bunch of Markdown files in folders. I keep the data in Dropbox to get free sync with Dropsync on my phone.

CLI tools

My terminal app of choice is Tilix. It's fast, has tiling support, and integrates nicely with the rest of the GNOME desktop. I use Bash with minimal customizations (prompt, a few aliases, and history settings).

I use vim for quick edits in terminal (no config to speak of – just syntax and smart indentation), ripgrep to search in files and fdfind to search files by name/extension. When connecting to a remote server I prefer screen (shows my age, I guess).

As a Python developer, I love the new ruff (a linter/formatter) and uv (package manager) tools so these get installed immediately (just drop them in /usr/local/bin).

I use git for version control, and my personal and work repos are hosted on GitHub. I don't use their CLI app though.

Media

OBS Studio

For any kind of screen recording or streaming, I use OBS Studio. Pretty vanilla setup, works great out of the box, I barely scratch the surface of its capabilities.

CLI tools

I use the command-line mpv for video playback, ffmpeg and friends for audio/video manipulation in the command line, and yt-dlp for downloading videos from YouTube (hey, that's not piracy, I'm a YT Premium subscriber!).

GIMP

If I need to do some image editing (cropping, resizing, adding text, minimal tweaking) I use GIMP. I'm not a graphic designer or a photographer, so GIMP is more than enough for my needs.

Spotify

I still have an old, carefully curated archive of MP3s somewhere, but these days I just use Spotify across all my devices. After a few years of use I've favorited enough of the songs I like that its recommendations are mostly on point. Not everything is there, and for that I use YouTube. I also like the fact that I can just download all my liked songs for offline use (eg. when outside wifi and mobile coverage, on a plane, or roaming).

Online meetings

This depends on what I'm working on and with whom, but some combination of Slack, Zoom, Google Meet, and Discord (ideally all in the browser whenever possible).


This list is not exhaustive, but it covers the apps I use (almost) daily and will invariably need on a computer. All of them also have very good alternatives, so the list is highly subjective – it contains the tools I prefer and that work well in my personal workflow. Each time I (re)install a workstation there's some tweaking, but these are the ones that I keep coming back to.

Conventional wisdom these days, especially for startups, is to design your software architecture to be horizontally scalable and highly available. Best practices involve Kubernetes, microservices, multi-availability zones at a hyperscaler, zero-downtime deployments, database sharding and replication, with possibly a bit of serverless thrown in for good measure.

While scalability and high availability are certainly required for some, the solutions are often being recommended as a panacea. Don't you want to be ready for that hockey-stick growth? Do you want to lose millions in sales when your site is down for a few hours? Nobody got fired for choosing Kubernetes¹, so why take the chance?

In the ensuing flame wars on Hacker News and elsewhere, this camp and the opposing You Ain't Gonna Need It (YAGNI) folks talk past each other, exchange anecdata, and blame resume-padding architecture astronauts for building it wrong.

A hidden assumption in many of these discussions is that higher availability is always better. Is it, though?

Everything else being equal, having a highly available system is better than having one with poor reliability, in the same way that having fewer bugs in your code is better than having more. Everything else is not equal, though.

Approaches to increase code quality, like PR reviews (maybe from multiple people) or high test coverage (maybe with a combination of statement-level and branch-level unit tests, integration tests, functional tests, and manual testing), have costs proportional to the effort put in. A balance is achieved when the expected cost of a bug (in terms of money, stress, damage, etc.) equals the additional cost incurred to try and avoid those bugs.
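As a back-of-the-envelope sketch of that balance (every number below is invented purely for illustration):

    # Back-of-the-envelope balance between bug cost and prevention cost.
    # Every number here is invented purely for illustration.
    def total_cost(bugs_per_year, cost_per_bug, prevention_cost_per_year):
        return bugs_per_year * cost_per_bug + prevention_cost_per_year

    # More review and testing effort lowers the bug rate but raises the fixed
    # cost; the sweet spot is wherever the total bottoms out for your numbers.
    print("light process:", total_cost(bugs_per_year=20, cost_per_bug=1_000, prevention_cost_per_year=5_000))
    print("heavy process:", total_cost(bugs_per_year=5, cost_per_bug=1_000, prevention_cost_per_year=40_000))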

You can't add a bit of Kubernetes, though. The decisions about horizontal scalability and high availability influence the entire application architecture (whichever way you choose) and are hard to change later. The additional architecture and ops complexity, as well as the additional platform price to support it, goes up much more easily than it comes down.

Faced with this dilemma, it pays to first understand how much availability we really need, and how quickly we will need to scale up if needed. This is going to be specific to each project, so let me share two examples from my personal experience:

At my previous startup, AWW, our product was a shared online whiteboard – share a link and draw together. It was used by millions of people worldwide, across all timezones, and the real-time nature meant it had to be up pretty much all the time to be usable. If you have a meeting or a tutoring session at 2PM and are using AWW, it better be working at that time! One stressful episode involved scheduled downtime on an early Sunday morning European time and getting angry emails from paying customers in India who couldn't tutor their students on the Saturday evening.

Clearly, for AWW the higher the availability, the better. During COVID, we also experienced the proverbial hockey stick growth, servers were constantly “on fire” and most of the tech work included keeping up with the demand. A lot of complexity was introduced, a lot of time and money was spent on having a system that's as reliable as it can be, and that we can scale.

On the other hand, at API Bakery, the product is a software scaffolding tool – describe your project, click a few buttons to configure it, get the source code and you're off to the races. It's a low-engagement product with very flexible time limits. If it's down, no biggie, you can always retry a bit later. It's also not such a volume product that we'd lose a bunch of sales if it's down for a few hours. Finally, it's not likely to start growing so fast that it couldn't be scaled up the traditional way (buy a few bigger servers) in a reasonable time frame (days). It would be foolish to spend nearly as much effort or money on making it scale.

When thinking about high availability and scalability needs of a system, I look at three questions (with example answers):

1) How much trouble would you be in if something bad happened:

  • low – nobody would notice
  • minor – mild annoyance to someone, they'd have to retry later; small revenue loss
  • major – pretty annoying to a lot of your users, they're likely to complain or ask for a refund; significant revenue loss
  • critical – everything's on fire, you can't even deal with the torrent of questions or complaints, incurring significant revenue and reputation loss
  • catastrophic – you're fired, your company goes under, or both

2) How often are you prepared to experience these events:

  • low – daily or weekly
  • minor – once per month
  • major – once or twice per year at most
  • critical – hopefully never?
  • catastrophic – definitely never!

3) What's the downtime for each severity:

  • low – 30s/day (AWW – we had auto-recovery so this was mostly invisible to users), 5min/day (API Bakery)
  • minor – 5min/day (AWW), 1h/day (API Bakery)
  • major – 1h/day (AWW), several hours outage (API Bakery)
  • critical – 4h+/day (AWW), several days outage (API Bakery)
  • catastrophic – 2+ days (AWW), a few weeks outage (API Bakery)

These are example answers to give you intuition about thinking in terms of expected cost. In this case, it's obvious that the availability and scalability needs of AWW and API Bakery are wildly different.
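For intuition, here's a small sketch that converts those daily downtime budgets into the availability percentages (the “nines”) that platform marketing usually talks in:

    # Convert a daily downtime budget into an availability percentage.
    # The budgets below are the example answers from the text, nothing more.
    def availability(downtime_seconds_per_day: float) -> float:
        return 100 * (1 - downtime_seconds_per_day / 86_400)

    for label, seconds in [("30s/day", 30), ("5min/day", 300), ("1h/day", 3_600)]:
        print(f"{label}: {availability(seconds):.3f}%")

    # 30s/day  -> 99.965%
    # 5min/day -> 99.653%
    # 1h/day   -> 95.833%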

Quantifying the costs of (not) implementing some architecture or infrastructure decision is harder, and also depends on the experience and skill set of the people involved. Personally, it's much easier for me to whip up a VPS with a Django app, a PostgreSQL database server, and the Caddy web server, with auto-backups, than it is to muck around with Helmfiles, configuring K8s ingress, and getting the autoscaling to work, but I know there are people who feel exactly the opposite.

When quantifying the cost, I think about:

  • is this something we already know how to do?
  • if not, does it make sense to try and learn about it (appreciating the fact that we will certainly do a substandard job while we're learning)?
  • can we engage outside experts, and will we be dependent on them if we do?
  • what are the infrastructure costs, and how easy is it to scale them up or down?
  • how will the added complexity impact the ongoing development, growth and maintenance/ops of the system?
  • how far can we push current/planned architecture and what would changing the approach entail?

We might not be able to get perfect answers to all these questions, but we will be better informed and base the decision on our specific situation, not cargo-culting “best practices” invented or promoted by organizations in a wildly different position.


¹ I'm not hating on Kubernetes or containers in general here. Those are just currently the most common solutions people recommend for failover and scaling.

The post-pandemic return-to-office (RTO) mandates are still the subject of intense debate about working in the office versus remotely, with the two sides often talking past each other. Can there even be a solution that would satisfy both sides? [0]

The setup

Assuming rational knowledge workers Alice and Bob with clear preferences, and a rational manager, Michael, let's think through their preferences, constraints, problems that arise and potential solutions.

Let's assume Alice and Bob both live in the same city (or within daily-commute distance) and work for Acme, Inc., a successful tech company whose CEO Michael wants to do the best for all the stakeholders (the employees, the shareholders, and the consumers). Both Alice and Bob are highly skilled, passionate about the company's mission, happy with their paychecks, and get along well with their coworkers.

Alice prefers working in an office every day. She has a short bike commute through a nice neighborhood, works on tasks where constant team collaboration is important, is energized by teamwork, and enjoys the company of her colleagues. She lives in a tiny apartment with no room for a nice home-office setup, and has noisy neighbors to boot. Alice loves her work-life balance, is happy that the work is confined to the office, and has no need or desire to read or respond to emails in the evenings.

Bob's got an hour-long commute each way through rush hour traffic. Working in an open-space office, he's constantly interrupted, either explicitly by a coworker or accidentally by someone chatting or just something that catches his eye every so often. Bob's work requires him to have deep focus for long stretches of time, so interruptions stress him out and make him less productive. He copes by arriving earlier or staying late to avoid rush hour and the crowded office, and by using noise-cancelling headphones. At home (in a quiet neighborhood), he's set up a cozy home office in the spare bedroom.

Michael knows that the best way for the customers to love Acme Inc products is to have high-quality people like Alice and Bob be happy and productive. He does need to keep an eye on expenses though, and that seldom-used big office space Acme rented on a long lease just before Covid is a constant thorn in the side of the Acme shareholders and board, and they're giving him grief for underutilizing it. Having risen from the trenches, Michael knows that both effective communication and focus time are important for the overall success of Acme, and is aware of the related challenges in both office and remote setups.

How should Michael organize the work so that both Alice and Bob are happy and that Acme continues to be successful?

The first choice that Michael faces is between finding an office/remote balance, or going all-in and converting Acme to a fully remote and geographically distributed company, with no offices and employees across multiple time zones. Aware that this brings a whole other set of challenges[1], Michael leaves that option to a future thought experiment, and chooses to find a hybrid office/remote setup.

(All-in) One way or another

Can Michael just choose one and make the choice palatable to the person preferring the other?

If he chooses full return-to-office, Alice is happy and Bob is miserable. Michael could pay Bob more so he relocates somewhere closer, but Bob really likes his house. He could provide a separate office, with a real door, and get an earful (or worse!) from the shareholders about the inefficient use of office space. He could try to enforce a quiet office (library style) and schedule meetings so that Bob maximizes focus time, but with hundreds of people's calendars to sync, he knows that's a losing game. Also, Alice would be less than happy for Bob to get paid more, or have a separate office, or for her calendar to get whacked, just because Bob has different preferences. And Bob might be okay with the arrangement, but he'd still prefer his nice house and cozy office if he had a choice.

If Michael chooses “work from home” for everyone, Bob is happy and Alice is miserable. Michael could pay Alice more to improve her home-office setup or get a bigger apartment, but that might come with a longer commute, a worse neighborhood, or both. Now she'd need to wear noise-cancelling headphones the entire day and spend hours on Zoom. Worse, since everyone works from home, people naturally take advantage of the extreme flexibility of working hours, and as a consequence there's always someone working and pinging her throughout the day[2], wreaking havoc on her work-life balance. In this scenario, Michael also gets pressured into getting out of that expensive office lease, as well as tapping remote talent. In due time Alice is laid off (too expensive! doesn't work well in a remote setting!), workers get hired from remote locations (cost benefits!), and Acme transitions into a fully distributed company.

Free-for-all

Suppose Michael chooses “work from wherever”. Bob can work fully from home, Alice comes to the office every day, and for a time all seems good.

Yet Alice is frustrated by half her team being away and still needing to hop on Zoom calls multiple times a day. They also now take advantage of their flexible schedule, so some work evenings. As a consequence, communication is worse and slower, and Alice needs to stay in the office longer as some meetings are pushed to later in the day. She tries to schedule “sync” days where everyone would be in the office, but her WFH coworkers have different preferences for when that'd be. And anyway, it's a half-measure, because ideally their team should touch base every day.

At first, Bob is happy. However, constant Zoom calls to “sync up” and people trying to get him to come to the office for meetings bug him a bit. It looks like his coworkers are unable to properly convey their thoughts in text or another async-friendly way and want to have a “quick call” for things that could have been 2 Slack messages. To make matters worse, the in-office crowd is all in the same room for the conference calls; he can barely hear half of them and has less chance to jump into the discussion due to network lag, so he becomes less involved and more passive on such calls. As time goes on, Bob also has a nagging suspicion that his manager has less insight into his challenges and accomplishments than into those of his in-office coworkers, and that his recent lack of a promotion/raise might be connected. This deadens his passion for the mission of Acme, Inc. – it becomes just another job.

Mandatory office days

Trying to balance flexibility and communication needs, Michael instead chooses an “X days in the office, 5-X days work-from-home” policy. There are two ways he can go about doing that.

The first option is to fix the days. For example, everyone must come to the office on Mondays, Wednesdays and Fridays, and has the option to work either in-office or from home on Tuesdays and Thursdays. This allows him to mandate that all meetings must happen on Mon/Wed/Fri, since everyone's physically present on those days. This makes Alice happy because she can continue to go to the office every day, and just needs to schedule her meetings on the “in-office” days. While it's better for Bob than going to the office every day, he's not terribly happy: because of obligations in his private life, he'd really prefer having those days be flexible and going to the office on Tue/Thu instead. Also, since there's still the same number of meetings but now two fewer days to hold them in, Mon/Wed/Fri are more packed, leaving smaller blocks of focus time, making him totally unproductive for focused work and effectively halving his productivity.

Finally, this setup makes the Acme Inc. shareholders angry, since there's no way to downsize the office. It still needs the same capacity but sits half-empty half of the week.

The second option is to have everyone choose the days they'll come to the office. Alice comes in every day, while Bob picks the days he's most comfortable with. The trouble is, having everyone in each team agree on the days is nigh impossible. The office looks half-empty all of the time, and every day they come to the office some different part of their team is missing. As a result they resort to doing the meetings on Zoom (with half of the team in the conference room). At least once a week someone's mic stops working or they're near a construction site, turning meetings into a “hello, can you hear me now?” quagmire. On top of this, Acme, Inc. downsized the office (to the delight of shareholders who got a fat dividend) and introduced hot desking. Bob and Alice don't actually have a desk any more, and they need to reserve one each time they come to the office. Neither is happy with the situation.

Us vs. Them

Can Michael mandate an “X days in the office, 5-X days from home” policy, and let each team fix their in-office days? Maybe.

Each person in each team might still have different preferences for office/home days, but in theory there's more chance that they might be compatible, or that a compromise is easier to reach. Since Michael can't simply organize Acme teams according to employees' WFH preferences, whether that works will be left to chance. So Alice might get stuck in a team that wants to be as remote as possible, or Bob in a team that's all in-office, or where everyone prefers exactly the days he doesn't. Furthermore, communication between teams will be harder, since they'll be more isolated from each other. It becomes easier than ever to blame the other team for delays or misunderstandings. And if Bob or Alice are team leads or managers of their respective teams, they'll likely need to adjust their schedules to be able to sync with the rest of the company.

If Michael can pull it off, gets lucky with employee preferences within each team, and can still ensure effective cross-team communication, this could work. That's a pretty big “if”, though.

One day to meet them all

Going back to the “X days in office” mandate, what if Michael chooses one day everyone must be in office, and mandates all meetings must be scheduled only on that day?

Alice can still come to the office each day. Since her work requires her to communicate with her team constantly, she's not too happy with the one-day-for-meetings mandate. Her meeting day is now packed full of back-to-back meetings, as is everyone else's, so there's a bunch of scheduling conflicts. The office has nowhere near enough meeting rooms, so groups huddle in different corners of bigger offices. Even if Alice loves teamwork, a full day of this is really exhausting and she wonders if the second half was useful at all. Also, since her work requires communication to happen more often, she must rely on Slack, mail, and impromptu 1-1 calls throughout the week, slowing her down and frustrating her.

Bob is happier, since he now gets large uninterrupted blocks of time for the rest of the week. Yeah, the full-day meetings are mind-numbing and he powers through, but at least there are no small chunks of time to waste in between. Also, he much prefers the once-a-week commute to the everyday commute. While it's not his preferred day, it's still better than having to go in multiple times a week!

On the other hand, the shareholders are out with pitchforks. Acme, Inc. pays through the nose for the office and they only use it one day per week? This puts enormous pressure on Michael to choose a different model. The fact that the office needs to be remodeled with more adequate meeting places adds to the pain. Michael also has a feeling that one touch point a week might be too infrequent, and is slowing his company down.

You choose, you lose

It seems like there's no winning combination. Whatever Michael chooses, someone will be unhappy.

Although Alice and Bob in this thought experiment embody the extremes, in any large company most people will have some mix of their traits and preferences, and there's no one solution that will satisfy everyone. This is before taking into account that the real executives might be less rational and will certainly have more constraints than our hypothetical Michael.

This realization should be the starting point of every RTO/WFH debate. Ideally, every organization would think through a similar thought experiment, identify realistic options and their consequences, taking people's preferences and other constraints into account, and see how it can optimize its processes to make one of the options work well for it.

Fewer, more impactful meetings combined with more effective async communication might be one thing. Private per-team offices and allowances for setting up home offices might be another. There are probably many more such tweaks an organization can make to render whichever choice it makes more palatable to everyone.

I admit, in a cutthroat corporate capitalism world, this sounds like a pipe dream. But hey, one can dream, can't they?


[0] Why should knowledge workers deserve such a perk, when blue-collar, emergency, and many other workers, working in worse, possibly dangerous, conditions and for worse pay, can never get the same? As long as WFH for knowledge workers doesn't negatively impact the others, I don't believe that's a valid argument: people in every profession naturally try to get the best possible work conditions. Knowledge workers are no exception. “Why should you have the WFH option when they can't?” doesn't help at all to advance a (valid) quest to live in a more equal world.

[1] Collaboration across time zones, effective communication with team members from different cultures, wages for people in wildly different cost-of-living places, to name a few.

[2] Alice can always choose to ignore any pings after her working hours, but that takes more mental effort and stress than not receiving them in the first place, even if nobody expects her to answer (which, if hours are entirely flexible for everyone else, is by itself a tall order). And that's even before we consider weekends.

A long time ago, information used to be hard to find. You had to go out there and look for it. Nowadays, a lot of information is at our fingertips at any moment.

Now we have the opposite problem: too much information. The challenge today is curation: we want to consume a reasonable amount of information that's relevant to us.

What's “reasonable” and what's “relevant” is tough to define as it is person-specific. What's reasonable and relevant to you may not be the same as what's reasonable and relevant to me.

From curation ...

Curation of information is nothing new: newspapers, radio and TV have done it since the beginning. We relied on the media to pick what's most important and only kept the choice of a source we trusted or liked.

As the Internet grew, the amount of information grew exponentially. One of the reasons for Google's early success is that it was very good at curating this information. While the other search engines could also find 10,000 pages matching your search, Google was the best at picking what was most useful, informative and relevant.

Other services were also trying their best at curation: Netflix had a great recommendation algorithm, and for years ran a public competition to optimize it and provide even better recommendations.

... to filter bubbles

Over the years, as the various algorithms got improved and optimized, they got subjectively worse for us, the users. Google is in an arms race with SEO folks, Facebook prefers to serve you content that will rile you up, Netflix's algorithm seems braindead, and the less we talk about YouTube recommendations the better.

Why is this? Turns out that, at some point in their lifecycle, big companies had to choose between optimizing for the user experience and optimizing for revenue. The algorithms improve all right ... but using a different metric.

Google now hyper-optimizes for what it thinks you should see (the filter bubble), Facebook serves you whatever will keep you on the site longer (mind-numbing memes interspersed with viral outrages), and Netflix brings front and center its own content that it would like you to get hooked on (and not cancel the service), not the things you might actually want to see.

Curation for the benefit of the user has turned into serving targeted content for the benefit of the company.

Enshittification

Enshittification is an ugly word to describe an ugly thing. Basically, it is a strategy change in an online platform where it switches from user growth (where optimizing user experience is paramount) to revenue growth (where you need to squeeze maximum income from each user).

Enshittification, or platform decay, is not limited only to Internet media companies. But it is especially visible in online startups that grew big by giving away their product or undercharging for it, in order to grow as fast and big as they can.

At some point you've got to earn money, and you have to earn as much money as possible, and since you have a mostly captive audience (there is only one Google, Facebook, YouTube, or Twitter), they won't leave even if the user experience gets marginally worse.

And so the UX frog gets slowly cooked. People today are hooked on Facebook scrolling, Twitter mob rages, memes everywhere and 10-second dopamine hits on TikTok or YouTube Shorts. It's the sugar thing all over again.

Information detox

Faced with this, some people try limiting their consumption of this kind of content (me included, see my Junk Social and Digital hygiene posts).

This can work if you're willing to limit your information access, don't suffer from the fear of missing out (FOMO) and have the mental strength to not succumb when you're tired, bored or just open up your favorite social media app on autopilot because it's on your mobile phone home screen.

But it's hard work: you're basically building and maintaining a Great Wall between yourself and most of the modern Internet.

Curation on our terms

Can we do it better? Is there a way to again outsource curation of content that is optimized for us, the users?

Curation is hard work, whether you want to build and maintain a sophisticated algorithm or AI to do it, or have actual people doing the work. And it's not entirely obvious how that would work without degenerating into the filter bubbles and social media we have today – that's how some of those companies started, after all.

As a technologist I do believe a solution is within the realm of possibility. As someone who's watched the Internet grow from an academic and hobbyist garden into what it is today, I am skeptical we'll reach that solution (and if you really want to be scared about the prospects of it, go read The Master Switch).


(last edit: Mar 26, 2025)

Any sufficiently advanced technology is indistinguishable from magic.

Modern AIs and ChatGPT in particular look like magic to many people. This can lead to misunderstanding about their strengths and weaknesses, and a lot of unsubstantiated hype.

Learning about a technology is the best antidote. For curious computer scientists, software engineers and anyone else who isn't afraid of digging a bit deeper, I've compiled a list of useful resources on the topic.

This is basically my reading/watching list, organized from more fundamental or beginner friendly to the latest advances in the field. It's not exhaustive, but should give you (and me) enough knowledge to continue exploring and experimenting on your own.

General overviews

If you don't have a lot of time, or don't know yet whether you want to dedicate the effort to learning the ins and outs of modern AIs, watch these first to get a general overview:

Fundamentals of neural networks

The videos here provide both a theoretical and a hands-on introduction to the fundamentals of neural networks.

MIT Introduction to Deep Learning

A good theoretical intro is MIT's 6.S191 class lectures, especially the Introduction to Deep Learning and Recurrent Neural Networks, Transformers and Attention.

These overview lectures briefly introduce all the major elements and algorithms involved in creating and training neural networks. I don't think they work on their own (unless you're a student there, do all the in-class exercises, etc.), but they're a good place to start.

The topics discussed here will probably make your head spin and it won't be clear at all how to apply them in real life, but this will give you the lay of the land and prepare you for practical dive-in with, for example, Andrej's “Zero to Hero”.

The example code slides use TensorFlow. Since Andrej's course uses PyTorch, going through both sets of lectures will expose you to the two most popular deep learning libraries.

Neural Networks: Zero to Hero

An awesome practical intro is the Neural Networks: Zero to Hero course by Andrej Karpathy (he also did the busy person's intro to LLMs linked above).

Andrej starts out slowly, by spelling out the computation involved in forward and backward passes of the neural network, and then gradually builds up to a single neuron, a single-layer network, multi-layer perceptron, deep networks and finally transformers (like GPT).

Throughout this, he introduces and uses tools like PyTorch (a library for writing neural networks), Jupyter Notebook, and Google Colab. Importantly, he first introduces and implements a concept manually, and only later switches to the PyTorch API that provides the same thing.

The only part where things look a bit rushed is the (currently) last – “Let's build GPT from scratch”. There's so much ground to cover there that Andrej skips over some parts (like the Adam optimization algorithm) and quickly goes over the others (self-attention, cross-attention).

The latest video in the series (not yet on the site as of this writing) is Let's Reproduce GPT-2, which can be also viewed standalone. In the video, he implements the full GPT-2 model (as described in the original paper), using PyTorch. By training on a newer, higher quality dataset, the model even approaches GPT-3 level of intelligence!

Overall a great guide. You only need to know the basics of Python, not be afraid of math (the heaviest of which is matrix multiplication, which is spelled out), and do the exercises (code along with the videos) without skipping the videos that don't seem exciting.
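To give a taste of that “spell out the computation” style before you commit to the course, here's a tiny sketch of a single neuron's forward pass with gradients derived by hand via the chain rule (pure Python; the values are arbitrary):

    # A single neuron: y = tanh(w*x + b), with gradients derived by hand
    # via the chain rule. Values are arbitrary toy numbers.
    import math

    x, w, b = 2.0, -0.5, 0.8

    # forward pass
    z = w * x + b          # pre-activation
    y = math.tanh(z)       # activation

    # backward pass: assume dL/dy = 1.0 for illustration
    dy_dz = 1 - y ** 2     # derivative of tanh
    dz_dw = x
    dz_db = 1.0

    grad_w = 1.0 * dy_dz * dz_dw
    grad_b = 1.0 * dy_dz * dz_db

    print(f"y={y:.4f} grad_w={grad_w:.4f} grad_b={grad_b:.4f}")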

Understanding embeddings

Both the MIT and Andrej's lectures touch on embeddings (the way to turn words into numbers that a neural net can use) only lightly. To deepen your understanding, What Are Embeddings by Vicki Boykis will teach you everything (and I mean everything) about embeddings.

If you don't want to read an entire book but still want to dive deep, the Illustrated word2vec article explains word2vec, a popular word embedding algorithm, step by step. It also features a video explanation for those who prefer it to text.

Another good lecture on the topic is Understanding Word2vec.
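To make the “words as numbers” idea concrete, here's a minimal sketch comparing made-up word vectors with cosine similarity, the operation most embedding material builds on (real embeddings are learned and have hundreds of dimensions):

    # Cosine similarity between word vectors. The 3-dimensional vectors
    # here are made up for illustration; real embeddings are learned.
    import math

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm

    king, queen, banana = [0.9, 0.8, 0.1], [0.85, 0.82, 0.15], [0.1, 0.05, 0.9]

    print(cosine(king, queen))   # high -- similar meaning
    print(cosine(king, banana))  # low  -- unrelated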

CNNs, autoencoders and GANs

The MIT lectures mentioned earlier also contain lessons on Convolutional Neural Networks, autoencoders, and GANs, which are important building blocks in neural networks used in vision.

Again, these are high-level overviews, and although formulas are present, the lectures give more of an overview of the algorithms without going into too much detail. That makes them an ideal prequel to the Practical Deep Learning course by Fast.ai.

Diffusion models

Diffusion models build on top of CNNs to create image-generating and manipulating AI models. Beyond the general overview linked earlier, the Introduction to Diffusion Models for Machine Learning is a deep dive into exactly how they work.

Coding Stable Diffusion from scratch in PyTorch is a hands-on implementation of the stable diffusion paper, similar in style to Karpathy's.

Practical Deep Learning

Practical Deep Learning is a free course by Fast.ai that has (current count) 25 lectures covering both high-level practical parts of neural networks and the underlying fundamentals.

In particular, in Part 2 they cover “zero to hero” on Stable Diffusion, a powerful image-generation AI model.

Large Language Models

These resources go in-depth about constructing and using large language models (like GPT):

Transformers

Andrej's course goes over the transformer architecture (the building block of GPT), but the complexity makes it easy to get lost on a first pass. To solidify your understanding of the topic, these two are super useful:

The Illustrated Transformer describes the transformer (building blocks of GPT) in detail while avoiding tedious math or programming details. It provides a good intuition into what's going on (and there's even an accompanying video you may want to watch as a gentler intro).

Follow that up with The Annotated Transformer, which annotates the scientific paper that introduced transformers and implements it in PyTorch. Since it's a 1:1 annotation of the paper, you need a lot of understanding already, so only attempt it once you've watched Andrej's course and read and understood the Illustrated Transformer.
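If the attention mechanism still feels slippery after all that, this small numpy sketch of scaled dot-product attention (single head, no masking, toy data) may help anchor the formula softmax(Q·Kᵀ/√d)·V:

    # Scaled dot-product attention, single head, no masking.
    # Shapes and data are toy values for illustration.
    import numpy as np

    def attention(Q, K, V):
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)                  # similarity of each query to each key
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
        return weights @ V                             # weighted sum of values

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, dim 8
    print(attention(Q, K, V).shape)  # (4, 8)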

Reinforcement Learning from Human Feedback

Language models are good at predicting and generating text, which is different from answering questions or having a conversation. RLHF is used to fine-tune the models to be able to communicate in this way.

The way RLHF works is that people score (a limited number of) outputs from the model. These outputs and scores are then used to train a separate “reward model”, which proxies for human judgement. The reward model is then used to train the LLM.
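The reward model is typically trained on pairwise preferences; one common formulation (used in the InstructGPT line of work) minimizes −log σ(r_chosen − r_rejected). Here's a tiny PyTorch sketch of that loss, with the reward scores stubbed out as toy values rather than coming from a real model:

    # Pairwise preference loss for a reward model: push the score of the
    # human-preferred response above the rejected one. The scores below are
    # toy values; in practice they come from an LLM with a scalar head.
    import torch
    import torch.nn.functional as F

    def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
        # -log sigmoid(r_chosen - r_rejected), averaged over the batch
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    r_chosen = torch.tensor([1.2, 0.3, 0.9, 2.0])
    r_rejected = torch.tensor([0.1, 0.5, -0.4, 1.5])
    print(preference_loss(r_chosen, r_rejected))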

Illustrating Reinforcement Learning from Human Feedback from the folks at HuggingFace (an open source AI community) provides a good overview of RLHF. They also did a webinar based on the blog post (the video is the complete webinar; the link jumps directly to the start of the RLHF description).

If you want to dive deeper, here's the InstructGPT paper from OpenAI, which basically describes the method they used to create ChatGPT out of GPT3 (InstructGPT was a research precursor to ChatGPT).

Fine-tuning

Fine-tuning allows us to refine or specialize an already (pre)-trained LLM to be better at a specific task (RLHF is one example).

Sebastian Raschka's Finetuning Large Language Models explains a few common approaches to finetuning, with code examples using PyTorch. He follows that up with Understanding Parameter-Efficient LLM Finetuning, a blog post discussing ways to lower the number of parameters that need to be trained, and an in-depth article about Parameter-Efficient LLM Finetuning with Low-Rank Adaptation (LoRA).
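The core trick behind LoRA is small enough to sketch: freeze the pretrained weight matrix and learn a low-rank update alongside it. A minimal PyTorch illustration (the sizes, rank, and scaling below are toy values):

    # Minimal LoRA-style linear layer: the pretrained weight is frozen,
    # only the low-rank matrices A and B are trained. Sizes are toy values.
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, in_features, out_features, rank=4, alpha=8):
            super().__init__()
            self.base = nn.Linear(in_features, out_features)
            self.base.weight.requires_grad_(False)   # freeze pretrained weights
            self.base.bias.requires_grad_(False)
            self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    layer = LoRALinear(64, 64)
    print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])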

While fine-tuning excels at making the model behave differently, RAG (Retrieval-Augmented Generation) is often a better choice for giving additional information or context to the LLM. A good overview of the use cases for both technologies is A Survey of Techniques for Maximizing LLM Performance by OpenAI.

Reasoning

Newer models, like OpenAI's o1 and o3 and DeepSeek R1, are reasoning models. This means they're optimized to spend more time/tokens at inference time (while answering a question) to “think through” the problem before answering.

Jay Alammar (of the “Illustrated *” series) wrote a good overview of the R1 process in The Illustrated DeepSeek-R1. Notably, R1 uses Reinforcement Learning (RL), not RLHF (see above) – the feedback on LLM output is based on objective non-human factors (functions, heuristics, etc.).

Another approach is the s1: Simple test-time scaling paper (PDF), which is pretty technical, but the major point (forcing the LLM to continue by injecting “wait” into its output) is simple and straightforward.

Check also Sebastian Raschka's Understanding Reasoning LLMs for an overview of the reasoning LLM methods.

Agents

Agents are LLM-driven components that can use tools (for example access the web) and communicate with other agents in the system to collaboratively solve a complex problem. Here's a good overview of agentic workflows by Andrew Ng, followed by a deeper dive by Harrison Chase (both videos from the Sequoia “AI Ascent” conference).

Building Effective Agents is a practical guide for building agents, workflows and other agentic AI patterns.

Non-LLM neural networks

Aside from the increasingly dominant Large Language Models, there are many specialized neural networks (some also using transformer architecture, or part of it) optimized for specific domains. Here are some of them:

Time series

Time series models attempt to understand the patterns of numerical data series and forecast future ones. Typically the neural network is trained on a large number of publicly available time series. At inference, the network is given a time series as context and outputs the forecast. Some models are base models and can be further fine-tuned to better model a specific domain.

Many modern time series models are transformer based:

A video overview of Chronos is also a good starting point to dive into the time series models in general. Decoding time series: the role of transformers in forecasting is a good overview with some benchmarks of the models.

Full courses

If you want a really deep dive (undergrad or higher level), follow these courses, including doing the exercises and playing around with the code:


This is a living document (ie. it's a work in progress and always will be). Come back in a few weeks and check if there's anything new.

ChatGPT and other AIs are all the rage and I see many (mostly junior) programmers worrying if the market for devs is going to dry up.

You don't need to worry, and here's why.

ChatGPT is scarily good. It really is. No it's not better at web search than Google (yet), and it's nowhere near being sentient. But for the tasks where it makes sense, it's very good. So I'm not going to tell you that you don't need to worry because it's a useless tool.

ChatGPT (and other AIs) is a tool, in the same way a calculator is a tool or a compiler is a tool. The word “calculator” used to refer to people doing number crunching. My first calculator was a handheld device. Now it's just an app. Yes, calculators, the people, lost their “job” doing mind-numbing number crunching, but they were able to work on more interesting problems in math, physics, or what have you.

Same with compilers. When compilers were invented, people were furious that someone thought a machine could do a better job than an expert programmer at crafting assembly code. Today almost nobody writes directly in assembly, except in rare cases where that makes sense. But “assembler programmers” didn't lose their jobs, they just became “C programmers” or “Lisp programmers”.

It is the same with the new AI models. They are very effective on a whole other level than just number crunching or compiling software, but at the end of the day they're just tools, like programming languages, libraries or APIs.

If you've been a programmer for more than a few years, you know you always need to learn new stuff and stay on top of things. You'll need to invest some time to learn what ChatGPT and others can do, or can't. What you don't want to do is completely ignore the trend, or (equally bad) think it'll solve all your problems (or put you out of business). And as John Carmack said in a recent tweet, keep your eyes on the delivered value and don't over-focus on the specifics of the tools.

I've been a programmer for some 30 years (20 or so professionally) and my usual reaction to new stuff is “oh, so they're reinventing that particular wheel again”. Yet the current crop of AIs gets me really excited, like I was a kid again uncovering the vast potential of what you can do with a machine that you can order around! I've been playing with ChatGPT and it makes me faster and it makes programming (more) fun!

Not by having it write my code – I don't use it like that, because it does produce bugs and it can hallucinate stuff. I use it to explore (how I might go about doing X), and to quickly recall something I forgot (how a particular API or library function is used, for example). As Simon Willison (of Django and Datasette fame) puts it, AI-enhanced development makes him more ambitious with his projects.

This is just the beginning, and we're seeing a boom in integrating these AIs into everything else. Copilot and Bing search are the big names, but people are experimenting with integrations into anything under the sun (I made a service that creates API backends based on your project description, for example). Time will tell which of these will be truly useful, but I have no doubt there will be a lot of them.

AI is a tool, with limitations, but with a lot of potential. It would be a shame not to use it effectively. It won't put you out of a job, it will make your job better.