Using a Recurrent Neural Network to Generate 'Old English' Names
While working through the sequence models course in the Deep Learning specialization offered by deeplearning.ai, one of the exercises was to implement parts of a recurrent neural network (RNN) that is trained on a set of known dinosaur names and then used to generate novel, dinosaur-sounding names that don't exist.
The following diagram gives an overview of how this RNN works when generating names. There is actually just one cell that is repeatedly activated by a sequence of inputs, with time moving from left to right. Each input x is a single English letter or a newline character (the separator between names). Each a is the hidden state, which carries information about the part of the sequence seen so far. Each output ŷ is one letter chosen from a probability distribution learned during training. The accumulated outputs form a generated name. (This is not a tutorial on RNNs, so I will not elaborate any further.)
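The generation loop described above can be written in a few lines of numpy. This is a minimal sketch rather than the course's actual code: the weight names (Wax, Waa, Wya) and the single-layer tanh cell follow the common vanilla-RNN convention and are assumptions here.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into a probability distribution."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def sample_name(Wax, Waa, Wya, ba, by, ix_to_char, newline_ix, max_len=30, rng=None):
    """Sample one name from a (hypothetical) trained character-level RNN.

    Starts from a zero input and zero hidden state, then repeatedly feeds
    each sampled character back in as the next input until a newline
    (the name separator) is drawn or max_len is reached.
    """
    rng = rng or np.random.default_rng()
    vocab_size, n_a = Wya.shape[0], Waa.shape[0]
    x = np.zeros((vocab_size, 1))   # first input x<1> is the zero vector
    a = np.zeros((n_a, 1))          # initial hidden state a<0>
    chars = []
    for _ in range(max_len):
        a = np.tanh(Wax @ x + Waa @ a + ba)   # update hidden state
        y = softmax(Wya @ a + by)             # distribution over the next character
        ix = rng.choice(vocab_size, p=y.ravel())
        if ix == newline_ix:                  # newline ends the name
            break
        chars.append(ix_to_char[ix])
        x = np.zeros((vocab_size, 1))         # feed the sample back as one-hot input
        x[ix] = 1
    return "".join(chars).capitalize()
```

With untrained (random) weights this produces gibberish, much like the early training snapshots below; the plausible names only emerge once the weights have been fitted to the training set.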
I thought it would be a fun exercise to apply this RNN model to generate a different kind of name, namely, Old English-sounding people names.
I found a website with a set of 299 Old English girls' names and a set of 300 Old English boys' names to serve as training data. I used the girls' and boys' names as separate training sets, and also concatenated them into a combined set.
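Preparing the data amounts to reading one name per line from each file, optionally concatenating the lists, and building the character vocabulary (the letters plus the newline separator). A minimal sketch, assuming plain text files with one name per line — the helper names here are my own, not the course's:

```python
def load_names(*paths):
    """Read one lowercase name per line from each file and combine them."""
    names = []
    for path in paths:
        with open(path) as f:
            names.extend(line.strip().lower() for line in f if line.strip())
    return names

def build_vocab(names):
    """Map each character (plus newline, the name separator) to an index and back."""
    chars = sorted(set("".join(names)) | {"\n"})
    char_to_ix = {c: i for i, c in enumerate(chars)}
    ix_to_char = {i: c for c, i in char_to_ix.items()}
    return char_to_ix, ix_to_char
```

The combined data set is then just `load_names("girls.txt", "boys.txt")` with both files passed in at once.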
Using the boys' names, 40_000 training iterations, and default values for the other parameters resulted in the following output, with selected snapshots along the way.
Iteration: 2000, Loss: 21.291615
Mhxvtrbn
In
Iyrtrbmdlldwerlrareletr
Md
Ystrbmdlldwerlrareletr
D
Uuran
Iteration: 18000, Loss: 13.390953
Pewton
Melbard
Murpt
Rad
Wirk
Hacford
Tore
Iteration: 34000, Loss: 12.374373
Rawston
Racfartreabroough
Ruwson
Racearumae
Worton
Macenteldrocm
Tronrerley
Iteration: 38000, Loss: 11.951417
Raynord
Radbarrorallon
Ritton
Rad
Uxrield
Rackley
Stanfield
Iteration: 40000, Loss: 11.763989
Raynnelbridft
Redcalllid
Rutton
Rach
Trowdeld
Hachet
Rouke
The values of the loss function quickly flatten out, generating some names that sound pretty good! I especially like Rawston, Raynord, and Worton. Occasionally, the model also generated an existing name such as Stanfield.
Using the girls' names and the combined set yielded similar results. The first iterations more or less produced garbage; then the loss dropped quickly and plausible names started to come out.
Some other very nice names that came out of the other data sets were Reynora, Wyrth, and Lytira.
Training iterations
I decided to try increasing the number of training iterations.
Increasing the iterations from 40_000 to 50_000 produced:
Iteration: 48000, Loss: 10.967822
Raynon
Radcad
Rutster
Rachaston
Wyndan
Ladcnesbarn
Tondell
Iteration: 50000, Loss: 10.976192
Raylot
Rack
Rowlon
Rachat
Worton
Olally
Wilks
To 60_000:
Iteration: 58000, Loss: 11.220603
Raynley
Leich
Lowst
Rafbert
Worthall
Eabe
Word
Iteration: 60000, Loss: 10.235180
Raynop
Leich
Lowsten
Rachash
Wostar
Eabert
Tondolp
Finally, to 70_000 iterations, at which point most of the generated names looked very plausible.
Iteration: 68000, Loss: 10.159134
Enwrid
Esled
Eswood
Ellad
Wyntel
Edbert
Remar
Iteration: 70000, Loss: 10.670650
Remley
Lefabe
Lowler
Radallege
Worthwele
Edbert
Won
Learning rate
I also tried decreasing the learning rate from the default value of 0.01 to 0.005.
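Concretely, the learning rate only scales the size of each gradient step. A minimal sketch of the update, including the gradient clipping typically used with vanilla RNNs to tame exploding gradients — the parameter-dictionary layout is an assumption here, not the course's exact code:

```python
import numpy as np

def sgd_step(params, grads, learning_rate=0.01, clip=5.0):
    """One gradient-descent update over a dict of parameter arrays.

    A smaller learning_rate (e.g. 0.005 or 0.003 instead of the default
    0.01) takes proportionally smaller steps, trading speed for stability.
    """
    for key in params:
        g = np.clip(grads[key], -clip, clip)  # clip each gradient element
        params[key] = params[key] - learning_rate * g
    return params
```

Halving the learning rate roughly halves each step, which can lower the final loss (as seen below) without necessarily improving the sampled names.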
Again running 70_000 iterations, the final snapshots were as follows. The loss decreased, but I don't think the generated names were significantly better.
Iteration: 68000, Loss: 9.361359
Settowele
Rele
Ruynegrattard
Sedbyn
Witton
Paddley
Wirkley
Iteration: 70000, Loss: 9.326370
Setwy
Regh
Ruyne
Sefbroume
Westan
Paiford
Wendels
Then I reduced the learning rate further, to 0.003:
Iteration: 68000, Loss: 9.297729
Rayton
Ramad
Rownuld
Radald
Wisten
Madburn
Wiok
Iteration: 70000, Loss: 9.504666
Raynord
Rald
Rrynselengloorwockley
Radbith
Wrondelsdesterodley
Madburme
Wiok
This was a very fun exercise to do. I could see other applications of this RNN, such as creating brand names for medications, baby names in other languages (this would work well for Swahili), and mythical place names for fantasy fiction.
Later: while in the midst of writing this blog post, I happened to discover Andrej Karpathy's blog post on recurrent neural networks, in which he briefly describes using an RNN to generate baby names! His post was written in 2015, so his work predated mine by several years. 8-D
Sources:
Sequence Models
https://www.coursera.org/learn/nlp-sequence-models
Old English Girl Names
http://www.top-100-baby-names-search.com/old-english-girl-names.html
Old English Boy Names
http://www.top-100-baby-names-search.com/old-english-boy-names.html
The Unreasonable Effectiveness of Recurrent Neural Networks
https://karpathy.github.io/2015/05/21/rnn-effectiveness/