While working through the sequence-models course in the Deep Learning specialization offered by deeplearning.ai, one of the exercises was to implement parts of a recurrent neural network (RNN) that was trained on a set of known dinosaur names and then used to generate novel, dinosaur-sounding names that don’t exist.

The following diagram gives an overview of how this RNN works when generating names. There is actually just one cell that is repeatedly activated by a sequence of inputs, with time moving from left to right. Each input x is a single English letter or a newline character (the separator between names). Each value a is the internal state, which depends on the part of the sequence seen so far. Each output ŷ is one letter, sampled from a probability distribution learned during the training phase. Concatenating these outputs forms a generated name. (This is not a tutorial on RNNs, so I will not elaborate any further.)
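The sampling loop described above can be sketched in NumPy roughly as follows. This is just a sketch, not the course's actual code: the parameter names (Wax, Waa, Wya, b, by) follow the course's convention but are assumptions here, as is the 50-character safety cutoff.

```python
import numpy as np

def sample(parameters, char_to_ix, ix_to_char, seed=0):
    # Assumed parameter layout: Wax (n_a, vocab), Waa (n_a, n_a),
    # Wya (vocab, n_a), b (n_a, 1), by (vocab, 1).
    Wax, Waa, Wya = parameters["Wax"], parameters["Waa"], parameters["Wya"]
    b, by = parameters["b"], parameters["by"]
    vocab_size = by.shape[0]
    n_a = Waa.shape[1]

    x = np.zeros((vocab_size, 1))    # first input: all-zeros vector
    a_prev = np.zeros((n_a, 1))      # initial hidden state
    newline_ix = char_to_ix["\n"]

    rng = np.random.default_rng(seed)
    indices = []
    while True:
        a = np.tanh(Wax @ x + Waa @ a_prev + b)   # update hidden state
        z = Wya @ a + by
        y = np.exp(z) / np.sum(np.exp(z))         # softmax over characters
        ix = rng.choice(vocab_size, p=y.ravel())  # sample next character
        indices.append(ix)
        if ix == newline_ix or len(indices) >= 50:
            break                                  # newline ends the name
        x = np.zeros((vocab_size, 1))              # feed sampled char back in
        x[ix] = 1
        a_prev = a
    return "".join(ix_to_char[int(i)] for i in indices).strip()
```

The sampled character is fed back in as the next input, which is how a single cell unrolls into a whole name.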

Recurrent neural network to generate names


I thought it would be a fun exercise to apply this RNN model to generate a different kind of name, namely, Old English–sounding names for people.

I found a website with a set of 299 Old English girls’ names and a set of 300 Old English boys’ names to serve as training data. I used the girls’ and the boys’ names as separate data sets but also concatenated them into a combined data set.

Using the boys’ names, 40_000 training iterations, and default values for the other parameters resulted in the following output, with selected snapshots along the way.

Iteration: 2000, Loss: 21.291615

Mhxvtrbn
In
Iyrtrbmdlldwerlrareletr
Md
Ystrbmdlldwerlrareletr
D
Uuran

Iteration: 18000, Loss: 13.390953

Pewton
Melbard
Murpt
Rad
Wirk
Hacford
Tore

Iteration: 34000, Loss: 12.374373

Rawston
Racfartreabroough
Ruwson
Racearumae
Worton
Macenteldrocm
Tronrerley

Iteration: 38000, Loss: 11.951417

Raynord
Radbarrorallon
Ritton
Rad
Uxrield
Rackley
Stanfield

Iteration: 40000, Loss: 11.763989

Raynnelbridft
Redcalllid
Rutton
Rach
Trowdeld
Hachet
Rouke

The loss function quickly flattens out, and the model generates some names that sound pretty good! I especially like Rawston, Raynord, and Worton. Occasionally the model generated an existing name, such as Stanfield.

Using the girls’ names and the combined set yielded similar results. The first iterations more or less produced garbage; then the loss dropped fast and plausible names started to come out.

Some other very nice names that came out of the other data sets were Reynora, Wyrth, and Lytira.

Training iterations

I decided to try increasing the number of training iterations.

Increasing the iterations from 40_000 to 50_000 produced:

Iteration: 48000, Loss: 10.967822

Raynon
Radcad
Rutster
Rachaston
Wyndan
Ladcnesbarn
Tondell


Iteration: 50000, Loss: 10.976192

Raylot
Rack
Rowlon
Rachat
Worton
Olally
Wilks

To 60_000:

Iteration: 58000, Loss: 11.220603

Raynley
Leich
Lowst
Rafbert
Worthall
Eabe
Word


Iteration: 60000, Loss: 10.235180

Raynop
Leich
Lowsten
Rachash
Wostar
Eabert
Tondolp

Finally, I went to 70_000 iterations, at which point most of the generated names looked very plausible.

Iteration: 68000, Loss: 10.159134

Enwrid
Esled
Eswood
Ellad
Wyntel
Edbert
Remar


Iteration: 70000, Loss: 10.670650

Remley
Lefabe
Lowler
Radallege
Worthwele
Edbert
Won

Learning rate

I also tried decreasing the learning rate from the default value of 0.01 to 0.005.
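The learning rate scales how far each gradient-descent step moves the parameters. A minimal sketch of the update, including the elementwise gradient clipping the course exercise uses to tame exploding gradients (the dictionary layout and the `max_norm` value of 5 are assumptions):

```python
import numpy as np

def update_parameters(parameters, gradients, learning_rate=0.01, max_norm=5.0):
    # Clip each gradient elementwise to [-max_norm, max_norm], then take
    # one plain gradient-descent step: theta <- theta - lr * grad.
    for name in parameters:
        grad = np.clip(gradients["d" + name], -max_norm, max_norm)
        parameters[name] = parameters[name] - learning_rate * grad
    return parameters
```

Halving the learning rate to 0.005 simply halves the step size, which tends to lower the final loss at the cost of slower progress per iteration.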

Again running 70_000 iterations, the final snapshots were as follows. The loss decreased, but I don’t think the generated names were significantly better.

Iteration: 68000, Loss: 9.361359

Settowele
Rele
Ruynegrattard
Sedbyn
Witton
Paddley
Wirkley


Iteration: 70000, Loss: 9.326370

Setwy
Regh
Ruyne
Sefbroume
Westan
Paiford
Wendels

I then reduced the learning rate to 0.003.

Iteration: 68000, Loss: 9.297729

Rayton
Ramad
Rownuld
Radald
Wisten
Madburn
Wiok


Iteration: 70000, Loss: 9.504666

Raynord
Rald
Rrynselengloorwockley
Radbith
Wrondelsdesterodley
Madburme
Wiok


This was a very fun exercise to do. I could see other applications of this RNN, such as creating brand names for medications, baby names in other languages (this would work well for Swahili), and mythical place names for fantasy fiction.

Later: while in the midst of writing this blog post, I happened to discover the blog post by Andrej Karpathy on recurrent neural networks, in which he briefly describes using an RNN to generate baby names! His post was written in 2015, so his work predated mine by several years. 8-D



Sources:

    Sequence Models
    https://www.coursera.org/learn/nlp-sequence-models

    Old English Girl Names
    http://www.top-100-baby-names-search.com/old-english-girl-names.html

    Old English Boy Names
    http://www.top-100-baby-names-search.com/old-english-boy-names.html

    The Unreasonable Effectiveness of Recurrent Neural Networks
    https://karpathy.github.io/2015/05/21/rnn-effectiveness/