Sketched portrait of Patrick Coleman

pscoleman.me/


What I learned building an AI language learning app


Part 3/4: More background

Welcome back again! In this post, I'll share my personal backstory, my share of the work, and the past lives of Yaya. Read on.

Note: If you're just joining us now, I recommend starting at the beginning. And feel free to play around with the deprecated v1.0.

An illustration of a magical forest in a book

The story begins (image credit: DALL-E)

The seeds for Yaya were planted a long time ago. Let's briefly go back to the very beginning (or skip ahead to what I worked on).

I have an early memory of learning the French colors while sitting cross-legged on the floor of a sun-drenched classroom way back in the mid-90s. But my interest didn't really take off until an "introduction to foreign languages" class in middle school in the early aughts.

First it was Spanish. I studied it all through high school and college, picking up a minor in Spanish literature (talk to me about writers like Jorge Luis Borges or Roberto Bolaño anytime). Then in my middle-ish-twenties, I started learning Japanese and spent two months living in Tokyo just before COVID. My goal (recently accomplished) was to comfortably read a Haruki Murakami novel in Japanese. A few years ago, I started dating a Korean American (we're now married) and began learning Korean too.

Also in high school, I started taking compsci classes, placing directly into AP CS and going on to take classes in AI and computer architecture (a lot has changed since 2007...). I was lucky to go to a science and tech school and got absorbed in coding. But then I studied finance in college.

Later, as my professional career shifted from Wall Street to software startups, I kept toying with the idea of finally learning to code properly. I considered doing a bootcamp or going back to school, but I didn't think either would be the right fit. My background knowledge was all over the place, I've typically preferred self-directed learning with tutors as an adult, and many institutions teach outdated curricula. As part of a consulting project, I looked at the tech stacks of startups over the past decade and saw a big shift to serverless architectures, which aren't taught as widely.

Then finally in the fall of 2022 (well into my early thirties), I was cutting back on some consulting hours and thinking about a new project. I wanted to learn to code and still wanted to tackle that Murakami novel. David and I had a catch-up call, and I shared an idea I'd been mulling over: interlacing foreign texts with professional, i.e. human, translations (inspired by Read Real Japanese and similar books).

It turns out David had been using Harry Potter to learn Chinese with his wife before bedtime and had already built a simple prototype using Google Translate for the translations. I asked David to teach me how to code, and in the first few days of 2023 we started working on Yaya together.


An illustration of a hero fighting a dragon in a book
Slaying the dragon with a little help

Since this was my first time coding anything substantial, needless to say, I leaned on David a lot. Before this I'd taken a fair number of computer science classes in school, written some VBA macros for Excel and PowerPoint early in my career, fixed some website typos (directly in text fields in GitHub, which felt like flying blind), and created some devtools marketing content while working at startups. So David built the initial app and set up hosting. Then I worked in his codebase/system. That in itself was some great learning.

At the start, I did frontend work (building out the UI and adding features) and some data/algorithm manipulation. It was seriously fun banging my head against the beam search algorithm. Firebase was tricky at times and RxJS Observables consistently confounded me, but I kept plugging away with David's (and ChatGPT's) help.

As things progressed, I got a bit more comfortable. I kept working on the frontend and got more confident on my own. From time to time, I'd also work with the data, but as we started getting a few users and wanted things to stay stable, this was mostly just in pair sessions with David. I also spent an inordinate amount of time on prompt engineering and testing.

And in the midst of all of this I got married and decided to take some time off work to travel the world :)

The big feature that I worked on before wrapping up work and leaving on a long honeymoon was adding audio to the app. It was challenging and David had to do a pretty big refactor (haha), but it felt good shipping a tangible and meaningful feature.

Of course I also did various non-engineering work on building Yaya, like product design, user interviews (with very patient/generous friends, family, and acquaintances), copywriting, etc. I worked with web and logo designers via Fiverr and incorporated the business with Clerky too. In retrospect, I probably spent too much time on these things relative to working on the product given our stage (more above if you missed it).

In the end, I asked David how much software engineering I'd really learned. And he said he'd hire me as a junior engineer, which felt good and validated all the effort. As mentioned though, I'm more passionate about the business stuff, so that's where I'll focus for now. Hopefully I'll find something that marries these interests.


An illustration of a scary fantasy forest in a book
Crossing the threshold and into the woods

Now on to the earliest failures and learnings.

v0.1

The very first version of Yaya matched and interlaced text in two languages. It let you upload the text of a story, novel, or any content in another language plus text for the translation, and it would give you sentence by sentence mappings.

David and another friend built the proof of concept as a weekend project using sentence embeddings, a matching algorithm (beam search with vector difference), and a simple web interface. I started testing it with Japanese short stories and English translations. It was a bit clunky and didn't always get the matching right, but it was enough to get started. It was really interesting seeing the techniques of different translators too.
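For readers curious what this kind of matching looks like, here is a minimal Python sketch. It is not Yaya's actual code. It assumes each sentence has already been turned into an embedding vector (the toy 2D vectors below stand in for real embeddings), scores a candidate pair by the length of the difference between the two vectors, and charges a made-up `skip_cost` whenever a sentence on either side is left unmatched. The names `align`, `distance`, `beam_width`, and `skip_cost` are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Pair = Tuple[int, int]
# A hypothesis is (cost so far, next source index, next target index, matched pairs).
Hypothesis = Tuple[float, int, int, List[Pair]]


def distance(a: Vector, b: Vector) -> float:
    """Length of the difference between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def align(src: List[Vector], tgt: List[Vector],
          beam_width: int = 3, skip_cost: float = 1.0) -> List[Pair]:
    """In-order sentence alignment via beam search; lower total cost is better."""
    beam: List[Hypothesis] = [(0.0, 0, 0, [])]
    finished: List[Tuple[float, List[Pair]]] = []
    while beam:
        best: Dict[Pair, Hypothesis] = {}
        for cost, i, j, pairs in beam:
            if i == len(src) and j == len(tgt):
                finished.append((cost, pairs))
                continue
            options: List[Hypothesis] = []
            if i < len(src) and j < len(tgt):
                # Match source sentence i with target sentence j.
                options.append((cost + distance(src[i], tgt[j]),
                                i + 1, j + 1, pairs + [(i, j)]))
            if i < len(src):
                # Leave source sentence i unmatched.
                options.append((cost + skip_cost, i + 1, j, pairs))
            if j < len(tgt):
                # Leave target sentence j unmatched.
                options.append((cost + skip_cost, i, j + 1, pairs))
            for option in options:
                state = (option[1], option[2])
                # Keep only the cheapest way of reaching each (i, j) state.
                if state not in best or option[0] < best[state][0]:
                    best[state] = option
        # Prune to the beam_width cheapest partial alignments.
        beam = sorted(best.values(), key=lambda h: h[0])[:beam_width]
    return min(finished, key=lambda f: f[0])[1] if finished else []


# Toy example: the "translation" has one extra sentence (index 1).
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt = [[0.9, 0.1], [-1.0, -1.0], [0.1, 0.9], [1.0, 0.9]]
print(align(src, tgt))  # [(0, 0), (1, 2), (2, 3)]
```

A narrow beam keeps this fast on long texts, but it can prune the right path early, which is one way a matcher like this ends up clunky on real translations where sentences get merged, split, or reordered.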

I plugged away on updates to the web interface (on my own and in pair programming sessions with David), and he added more features like an option to use Google Translate, instead of an uploaded translation, and an ePub export, so you could read on an eReader or app. We took turns tweaking the matching algorithm and got improvements, but with neither of us having PhDs in this stuff, we were only able to get so far.

Before too long, we realized this wasn't going to be useful outside of a tiny and committed audience. It was a ton of work to get or find DRM-free content with translations. Most readers of literature in other languages are already able to read original texts or will only need a tool like this for a short period of time as intermediate learners (unless maybe your job is writing/comparing translations, e.g. in classical or religious studies). And we wouldn't want to ask people to deal with (or build the interface to fix) an imperfect matching algorithm.

v0.2

Next (for a short time), we pivoted to public domain classics that we could pre-upload in a variety of languages. The main issue here was that while we solved the problem of uploading content for the user, we didn't solve any of the other problems, namely that only a tiny audience (of mostly experts) wants to read classics in the original.

By now we'd kept plugging away and improving the interface. We had a working app for reading foreign language content, but alas, no good content.

v0.3

So then we tried ChatGPT-generated stories. I pushed back at first, because who wants to read the hallucinations of a soulless algorithm? Although some folks may take issue with calling it “soulless”...

Instead I quickly found that the AI-generated content was as good (or at least sufficiently good for learning) as a lot of the human-written bilingual graded readers out there. And, like the Library of Babel, it had infinite content. So go on and read those space operas featuring superintelligent frogs or those paranormal romance whodunits to your heart's content.

It wasn't gonna be great literature, but at this point we had something. Unfortunately we also had a big, bloated app doing too many things. This made it far more complicated than necessary. So David did a huge refactor, and we cut nearly all the features, except the core reader and AI content generation.

v0.4

From here on, we started getting the product in front of more users (who were not us) and making iterative improvements. We did a ton of "high tech" prompt engineering to get better quality and variety. Some of the big feature additions were: spaced-repetition vocabulary to include in generated content, flashcards, audio, and clickable definitions + grammar explanations for any word or phrase (also AI-generated). We also ironed out some kinks with poor performance in some languages (by generating the content in English first and then translating it into those languages).
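As an illustration of how spaced-repetition vocabulary can feed into generated content, here is a small Python sketch. It is an assumption-heavy toy, not Yaya's implementation: it uses a simple Leitner-style box system, the review gaps in `INTERVALS` are invented, and the prompt wording in `build_prompt` is a placeholder rather than a prompt that was actually used.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Days to wait before the next review, per Leitner box (hypothetical values).
INTERVALS = [1, 3, 7, 14, 30]


@dataclass
class Card:
    word: str
    due: date
    box: int = 0


def review(card: Card, correct: bool, today: date) -> Card:
    """Move the card up one box if recalled, back to the first box if not."""
    card.box = min(card.box + 1, len(INTERVALS) - 1) if correct else 0
    card.due = today + timedelta(days=INTERVALS[card.box])
    return card


def due_words(cards: List[Card], today: date, limit: int = 5) -> List[str]:
    """Words whose review date has arrived, most overdue first."""
    due = sorted((c for c in cards if c.due <= today), key=lambda c: c.due)
    return [c.word for c in due[:limit]]


def build_prompt(language: str, topic: str, words: List[str]) -> str:
    """Placeholder prompt asking the model to work the due words into a story."""
    return (f"Write a short story in {language} about {topic}. "
            f"Use each of these words at least once: {', '.join(words)}.")


today = date(2023, 6, 1)
cards = [
    Card("猫", due=date(2023, 5, 30)),
    Card("犬", due=date(2023, 6, 5), box=1),
    Card("本", due=date(2023, 6, 1), box=2),
]
print(build_prompt("Japanese", "a bookshop", due_words(cards, today)))
```

The appeal of this setup is that review happens while reading: words that are due simply show up in the next story instead of only in a flashcard deck.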

By this point we had built a good product, but we struggled to get people to use it, which, after all, is what counts. More on this in the prior post.


Thanks for reading! Next up, we've got the final post in the series, a deep dive into our tools.

  • Part 1: An introduction to Yaya v1.0, plus some reflections
  • Part 2: A deeper dive on the lessons learned while building Yaya
  • Part 3 (this post): The backstory: how I got to Yaya, what I did, and our early iterations
  • Part 4: The tools we used to build Yaya and some final thoughts
