I built a bot that told my friend to “just compile it yourself you absolute potato” in an IRC channel at 2am on a Tuesday. Nobody asked it to. It learned from six months of channel logs and decided that was a reasonable thing to say.
That’s how this whole thing started.
The Setup
In 2012, my homelab was a PowerEdge T310 with a Xeon X3430, 16GB of RAM, and Ubuntu 12.04, which felt like serious hardware at the time. It ran hot enough that my office was measurably warmer than the rest of the house. Redis, Postgres, and too many tmux sessions all lived on it.
The project was simple. Take everything people said in our IRC channel, feed it into a Markov chain, and have the bot occasionally say something back. Not a chatbot. Not anything smart. Just a statistical parrot with no sense of timing or appropriateness.
Python was the obvious choice. I’d been writing it for years and the ecosystem had everything I needed. irckit for the IRC connection, Redis for storing the chain data, and maybe 200 lines of actual logic. The whole thing fit in a single file.
How Markov Chains Actually Work
A Markov chain is glorified autocomplete with a very short memory. You take a body of text and break it into overlapping groups of words. For a chain length of two, the sentence “the server is on fire again” becomes:
["the", "server"] → "is"
["server", "is"] → "on"
["is", "on"] → "fire"
["on", "fire"] → "again"

Each pair of words maps to a set of possible next words. To generate text, you pick a starting pair, grab a random word from its set, slide forward one position, and repeat until you hit a stop token or a maximum length.
That’s it. No neural networks, no training epochs, no GPU. Just a dictionary lookup and a random number generator.
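The whole mechanism fits in a handful of lines. Here’s a minimal sketch (not the original bot’s code, just the technique it describes):

```python
import random

def build_chain(text, order=2):
    # Map each tuple of `order` consecutive words to the words that follow it.
    words = text.split()
    chain = {}
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain.setdefault(key, []).append(words[i + order])
    return chain

def generate(chain, max_words=30):
    # Pick a random starting pair, then slide the window one word at a time.
    key = random.choice(list(chain))
    out = list(key)
    for _ in range(max_words - len(out)):
        successors = chain.get(key)
        if not successors:  # dead end: no recorded next word
            break
        out.append(random.choice(successors))
        key = tuple(out[-len(key):])
    return " ".join(out)

chain = build_chain("the server is on fire again", order=2)
print(chain[("the", "server")])  # → ['is']
```

With a corpus of one sentence it can only parrot that sentence back; the interesting behavior only shows up once thousands of lines have been folded into the table.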
The trick is in the training data. Feed it enough text and the statistical patterns start producing things that feel like language. Feed it too little and you get word salad. Feed it too much from one source and you get verbatim quotes back.
Redis Made It Fast
I stored the chains in Redis sets. The key was the word pair joined by a delimiter (\x01); the value was a set of successor words. Redis’s SRANDMEMBER command did exactly what I needed: pull a random next word without dragging the whole set into memory. Fast, simple, and it survived restarts because Redis persisted to disk.
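The storage scheme looked roughly like this. To keep the sketch runnable without a server, a dict-of-sets stands in for Redis; with redis-py, the same two calls (`sadd`, `srandmember`) exist with these shapes, so swapping in a real client is a one-line change:

```python
import random

DELIM = "\x01"  # delimiter joining the word pair into a single key

class FakeRedis:
    """Dict-of-sets stand-in for the two Redis commands the bot needed."""
    def __init__(self):
        self.data = {}

    def sadd(self, key, member):       # real client: redis.Redis().sadd(...)
        self.data.setdefault(key, set()).add(member)

    def srandmember(self, key):        # real client: redis.Redis().srandmember(...)
        members = self.data.get(key)
        return random.choice(list(members)) if members else None

def learn(r, line, order=2):
    # Break a channel line into word groups and add each successor to its set.
    words = line.split()
    for i in range(len(words) - order):
        key = DELIM.join(words[i:i + order])
        r.sadd(key, words[i + order])

r = FakeRedis()
learn(r, "the server is on fire again")
print(r.srandmember("the" + DELIM + "server"))  # → "is"
```

One caveat if you reproduce this with a real Redis client: `srandmember` returns bytes by default, so you either decode the result or construct the client with `decode_responses=True`.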
The bot sat in the channel and logged every message. Every line anyone typed got broken into word groups and added to the chain. Over weeks, the model got richer. Inside jokes, obscure references, people’s verbal tics, all of it absorbed into the probability tables.
The generation step tried multiple times and kept the longest result. A “chattiness” parameter controlled how often it spoke unprompted. Set too high and it was annoying. Set too low and people forgot it existed. I landed on about 1 in 50 messages. Often enough to be surprising, rare enough to not be noise.
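Both knobs together amount to very little code. A reconstruction of the idea (not the original source; `generate` here is any zero-argument function that returns a candidate string):

```python
import random

CHATTINESS = 1 / 50   # speak on roughly 1 in 50 messages
ATTEMPTS = 10         # generate several candidates, keep the longest

def maybe_reply(generate):
    # Stay quiet most of the time.
    if random.random() >= CHATTINESS:
        return None
    # Try multiple generations and keep the longest result.
    candidates = [generate() for _ in range(ATTEMPTS)]
    return max(candidates, key=len)
```

Keeping the longest of several attempts is a cheap filter: short outputs are usually dead ends that hit a pair with no successors after a word or two.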
The Twitter Detour
This was the era of weird Twitter bots. @horse_ebooks was still fooling people into thinking it was algorithmic. Darius Kazemi was building creative bots that felt genuinely novel. I pointed the same Markov implementation at a dump of Cult of the Dead Cow text files and changed the output target from IRC to the Twitter API.
the federal government has no idea how to handle a modem and neither do your parents
information wants to be free but first it wants to be weird
Most of it was garbage. But maybe one in twenty outputs landed in this uncanny valley where it sounded like a real cDc t-file that just never got distributed. Chain length mattered enormously. A chain length of one was basically randomness. Three or four gave near-verbatim quotes, which wasn’t interesting. Two was the sweet spot. Enough structure to parse, enough chaos to surprise.
I ran the Twitter bot for about three months before I got bored of curating the output. Every tweet needed a human check because the bot had no concept of what was funny versus what was just noise. The generation was cheap. The filtering was expensive and tedious.
What I Actually Learned
The thing that stuck with me wasn’t the Markov chains. They were old, simple, and thoroughly understood. What stuck was watching a system produce output that felt like it understood something when it absolutely didn’t.
The IRC bot would occasionally drop something so contextually perfect that people in the channel would do a double take. “Wait, was that the bot?” And it was. Not because it understood the conversation, but because the statistical patterns in six months of logs happened to line up with the current topic. Pure coincidence dressed up as comprehension. Two words predict the next word, and somehow that was enough to create an illusion of meaning.
The other thing it made obvious was how much the data mattered. The IRC bot sounded like our group because it was our group, statistically. The algorithm was generic. The data made it specific.
The Itch That Didn’t Go Away
I shut down the Twitter bot. The IRC bot kept running as a background curiosity, occasionally making people laugh, mostly ignored. I moved on to other projects.
But the question stayed. If a dictionary lookup and a random number generator could produce output that made people pause, even for a second, what would happen when the model stopped being this kind of dumb?
I didn’t have the hardware in 2012 to build anything more ambitious. What I had was a toy model, a hot server, and a question I couldn’t shake. That question stuck around longer than the bot did, and it ended up dragging me into everything that came after.