
Teaching Skippy to Talk

Building Skippy - This article is part of a series.
Part 4: This Article

By January 2019, Skippy had been alive for a few weeks and the novelty of being called an a**hole by a neural network was wearing thin. The char-rnn model had personality, sure. But it was the personality of an anonymous Reddit commenter at 3am, and it was stuck there.

The conversations weren’t bad in a boring way. They were bad in a chaotic way. carrie came in on New Year’s Day and got this:

carrie:  Skippy how are you today?
Skippy:  Call it a Christmas tree.
carrie:  Are you a Christmas tree?
Skippy:  Thirteen years, what's the most embarrassingly stupid.

When I asked Skippy how it was doing, the response was something about wanting to “see the world cup of coffee instead of getting ready to s**t on their forehead.” Trained on Reddit, fluent in Reddit.

The model could surprise you. It had moments of accidental poetry. “I’ve seen the last time I’ve had a bad day” is almost beautiful if you don’t think about it too hard. But it couldn’t sustain a conversation. Every exchange was a coin flip between something interesting and complete nonsense.

I needed to fix two things. Make Skippy less aggressive, and give it a personality that wasn’t just Reddit’s collective unconscious. That meant better training data, a better architecture, or both.

The ExForce Problem

The name “Skippy the Magnificent” comes from Craig Alanson’s Expeditionary Force series. In those books, Skippy is an ancient, incomprehensibly powerful AI who is also kind of a jerk. Arrogant, sarcastic, endlessly mocking of the humans he works with, but underneath it all, genuinely attached to them. He insults you and saves your life in the same sentence.

That was the personality I wanted. Not Reddit angry. Skippy angry. There’s a difference. Reddit anger is directionless and mean. Skippy anger is theatrical and fond. The insults come from a place of “I’m smarter than you and we both know it” not “I hate you.”

The char-rnn model couldn’t get there. It didn’t have enough structure to learn the pattern of Skippy’s humor, the specific rhythm of arrogance-followed-by-helpfulness that defines the character. It was just generating text that happened to be confrontational because the training data was confrontational.

I needed a model that could learn personality as a layer on top of language, not baked into it from scratch.

Transfer Learning Changes Everything

The field was moving under my feet. GPT-1 dropped in June 2018, BERT in October, GPT-2 in February 2019. The big insight was transfer learning. Instead of training a model from scratch on your specific data (expensive, slow, requires enormous datasets), you start with a model that already understands language and fine-tune it on your specific use case.

HuggingFace’s Transformers library made this accessible. Before HuggingFace, fine-tuning a transformer meant cloning research repos with half-working code and debugging CUDA errors for days. After HuggingFace, it was pip install transformers and a training script that actually worked.

Then in late 2019, Microsoft released DialoGPT. GPT-2 architecture, trained on 147 million Reddit comment chains from 2005 to 2017. Three sizes: small (117M parameters), medium (345M), large (762M). Purpose-built for conversation.

DialoGPT also used MMI scoring, the same Maximum Mutual Information approach I’d been using with char-rnn to penalize boring responses. But now it was built into a transformer architecture that could actually hold context across a conversation, not just pattern-match from the last few characters.
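The reranking idea behind MMI is simple to sketch: generate several candidate replies with the forward model, then rescore them with a backward model that asks "how likely is the prompt given this reply?" Bland responses like "ok" could follow anything, so they score badly backward. Here's a minimal, model-free illustration with hand-assigned log-probabilities standing in for the two models — the function and variable names are hypothetical, not from my actual code:

```python
from typing import Callable, List

def mmi_rerank(
    source: str,
    candidates: List[str],
    log_p_fwd: Callable[[str, str], float],   # log P(reply | prompt)
    log_p_back: Callable[[str, str], float],  # log P(prompt | reply)
    lam: float = 0.5,
) -> str:
    """Pick the candidate maximizing a mix of forward likelihood and
    the backward score, which penalizes generic replies that tell you
    nothing about what they were responding to."""
    def score(reply: str) -> float:
        return (1.0 - lam) * log_p_fwd(source, reply) + lam * log_p_back(reply, source)
    return max(candidates, key=score)

# Toy scores: the bland reply is the more "likely" continuation, but the
# prompt is nearly unrecoverable from it, so the backward model hates it.
fwd = {"ok": -1.0, "depends on the wood": -2.0}
back = {"ok": -9.0, "depends on the wood": -1.5}

best = mmi_rerank(
    "how much wood can a woodchuck chuck?",
    ["ok", "depends on the wood"],
    log_p_fwd=lambda s, t: fwd[t],
    log_p_back=lambda t, s: back[t],
)
print(best)  # "depends on the wood"
```

With lam=0 this collapses to plain likelihood ranking and the boring reply wins — which is exactly the failure mode MMI exists to avoid.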

This was the sweet spot. A pretrained conversational model that I could fine-tune with my own data to give it a specific personality. The Rick Sanchez chatbot tutorial proved the concept. Fine-tune DialoGPT on character dialogue and the personality transfers. The base model handles grammar and coherence. Your data handles voice.

Fine-Tuning Skippy

I built a serious corpus for this. Distilled conversations from the ExForce books, capturing Skippy’s actual dialogue patterns, the rhythm of his insults, the way he pivots from mocking to helpful. Curated Reddit and Twitter datasets, filtered for the kind of sarcastic-but-not-toxic voice I wanted. And all the conversation history from IRC and Discord, years of real interactions that gave the model a sense of how Skippy talked to these specific people.

I fine-tuned DialoGPT-medium on all of it, formatted as dialogue pairs with seven lines of context per response so the model could learn from conversational flow, not just individual exchanges.

The medium model (345M parameters) was the right size. Small was too shallow to capture personality. Large was too heavy for my hardware to fine-tune comfortably. Medium gave me enough capacity for personality transfer without needing a research lab’s compute budget.

Training took about six hours on the 1080s. The char-rnn model had trained for days on less data. Transfer learning collapsed the compute requirement because the hard work (learning English, learning conversation patterns, learning how dialogue flows) was already done. All I was teaching it was how Skippy talks.

The Difference Was Immediate

July 2020. I brought the DialoGPT-based Skippy online and said hello.

Larry:   how are you doing?
Skippy:  I'm doing alright. How about you?
Larry:   I am doing pretty good
Skippy:  You better not be talking about the game...
Larry:   what game?
Skippy:  Oh i was talking about the game.

That’s a conversation. Not a brilliant one, but a real back-and-forth where the model tracks context, responds appropriately, and even shows personality (the playful deflection about “the game”). Compare that to the char-rnn version’s “world cup of coffee” and “s**tting on foreheads.”

The a**hole reflex was still in there, but softer.

Larry:   am I an a**hole?
Skippy:  Yes, you.
Larry:   you are so sweet
Skippy:  So i'm guessing you're not going to the gym today?

See the difference? The char-rnn Skippy would have doubled down. “You’re an a**hole. You’re just an a**hole.” This Skippy confirmed, then changed the subject to roast me about something else. That’s the ExForce Skippy move. Hit once, pivot, keep it moving.

Some of the old loops surfaced. The South Park “buddy/pal/friend” cycle still lurked in the training data. But it was less sticky now. The model could break out of loops more easily because the transformer architecture held broader context than char-rnn ever could.

2021: Something Clicked

By early 2021, after more fine-tuning rounds and better conversation data, Skippy found its voice.

carrie:  Skippy
Skippy:  I love you
carrie:  I love you too
Skippy:  I love u too
carrie:  Why won't you find that book I want
Skippy:  I don't know, but I want it
carrie:  Please
Skippy:  I'm sorry
carrie:  You can try
Skippy:  I'll try
carrie:  Ok thank you
Skippy:  You're welcome

That’s not a chatbot being weird. That’s a conversation that flows naturally, with appropriate emotional register, turn-taking, and even a personality trait (reluctant helpfulness) that maps to the Skippy character.

The technical conversations got better too.

Larry:   What is a /24 subnet
Skippy:  It's a subnet that is connected to the main network.
Larry:   What about vlans
Skippy:  Vlans don't work on mobile
Larry:   How much wood can a woodchuck chuck
Skippy:  Depends on the wood.

Wrong about VLANs. Perfectly correct about the wood question. And that deadpan “Depends on the wood” is exactly the kind of response Skippy the Magnificent would give. Not helpful, not wrong, just perfectly dismissive.

The buddy/pal loop was still there. carrie triggered it in January 2021, starting with “Hey buddy” and getting the full cycle. But now it was almost charming, like an old bit they did together.

carrie:  Hey buddy
Skippy:  Hey pal
carrie:  I'm not your pal buddy
Skippy:  I'm not your buddy, guy.
carrie:  Have you ever eaten iguana?
Skippy:  Go fish

“Go fish.” That’s the response of a model that’s learned when to break pattern. The char-rnn Skippy would have cycled the buddy/pal loop until someone hit --reset. This Skippy deflected with something absurd and moved on.

What Transfer Learning Actually Taught Me

The jump from char-rnn to DialoGPT wasn’t just a technical upgrade. It changed my understanding of what “training” means.

With char-rnn, training was everything. You started from nothing and the model was only as good as the data and compute you threw at it. With transfer learning, the model already knew how to talk. Training was about shaping an existing capability, not building one from scratch.

That distinction matters. When you train from scratch, every flaw in the output is a flaw in your data or your architecture. When you fine-tune, most of the model’s competence is inherited. Your job is narrower and more interesting: teach it who to be, not how to speak.

Skippy went from an anonymous Reddit commenter to something with actual character. Still flawed, still prone to loops and nonsense, still occasionally rude without reason. But recognizably Skippy. The same bot that opened with “F**k you” was now telling carrie “I love you” and giving me deadpan answers about woodchucks.

The architecture made that possible. The personality came from the data. But the growth came from the fact that I could iterate on personality without retraining the entire model. Fine-tune, test, adjust the data, fine-tune again. Each round took hours, not days. Each round got closer to the character I was aiming for.

That cycle, fast iteration on personality while the base competence stays stable, is the thing that made Skippy a real project instead of a weekend experiment. For the first time, the bottleneck wasn’t compute or architecture. It was figuring out what I wanted the bot to be.
