Saturday, April 24, 2010

A Better Name Generator

In my ongoing quest to provide genuinely useful tools to the RPG community, I considered what kind of tools I'd most like to see. The first thing that came to mind is a better fantasy/sci-fi name generator. Not that there aren't fine name generators out there (see here and here and here). It's just that I've never been fully satisfied. It seems to me that names should feel like they fit within a language and culture and that it should be possible to have names or parts of names that actually mean something.

In less stressful times, I've messed around with constructed languages (see the awesome Language Construction Kit). But even in the best of times, I've never been able to develop much of a vocabulary.

That's where the Fantasy Language Cypher comes in. Using statistics for phoneme frequency in both English* and a user-supplied model, I figured that I could algorithmically transform sample text into something that resembles a fantasy language. And while it is really just a cypher of the original text, it is close enough to a real language that it seems fine for use in RPGs. I hope to use it mostly for names but it's also great for flavor text.

Future Improvements
  • Support for special characters (particularly accented letters and such). 
  • List of favorite model texts (I particularly like the works of Clark Ashton Smith, especially those translated into other languages).
  • Support for HTML tags in the text to be translated, which would be preserved in the output.

* Derived, appropriately enough, from the text of Risus: The Anything RPG.


Swordgleam said...

I don't know if this was intentional, but simply clicking the translate button again and again with all your default text there causes several alternate translations.

John Morrow said...

The system I've been using to produce words that look like a source language counts frequencies, but not of individual letters. Instead, I count the frequency of clusters of vowels and consonants in three positions (at the start of a word, in the middle of a word, and at the end of a word). I also record which letter clusters follow (e.g., which consonant clusters follow a vowel cluster and which vowel clusters follow a consonant cluster) and use that to make sure that appropriate letters/sounds follow. I broke my programs into two parts: an analysis component that outputs the frequency and follow-on information in an editable format (so I can tweak the results) and can cut off the output at a certain frequency threshold, and a generator component that takes the analysis output, turns it into words, and allows the number of syllables to be specified.

So if you want to improve your output, I suggest dealing with vowels and consonants as clusters instead of individual letters and figuring out some way to take which sounds follow each other into account.
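
The analysis step described in this comment could be sketched roughly as follows. This is only an illustration, not John Morrow's actual program: the function names, the "start"/"middle"/"end" labels, the choice to treat "y" as a vowel, and the decision to count a one-cluster word as "start" are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

# Assumption: "y" counts as a vowel; everything else alphabetic is a consonant.
CLUSTER_RE = re.compile(r"[aeiouy]+|[^aeiouy]+")
WORD_RE = re.compile(r"[a-z]+")


def split_clusters(word):
    """Split a lowercase word into alternating vowel and consonant clusters."""
    return CLUSTER_RE.findall(word)


def analyze(text):
    """Count clusters by position in the word and record what follows each one."""
    position_counts = {"start": Counter(), "middle": Counter(), "end": Counter()}
    followers = defaultdict(Counter)  # cluster -> Counter of the clusters that follow it
    for word in WORD_RE.findall(text.lower()):
        clusters = split_clusters(word)
        last = len(clusters) - 1
        for i, cluster in enumerate(clusters):
            # A word made of a single cluster is counted as "start" (assumption).
            if i == 0:
                position = "start"
            elif i == last:
                position = "end"
            else:
                position = "middle"
            position_counts[position][cluster] += 1
            if i < last:
                followers[cluster][clusters[i + 1]] += 1
    return position_counts, followers


if __name__ == "__main__":
    counts, follows = analyze("the three trees")
    print(counts["start"].most_common())
    print(dict(follows["tr"]))
```

A generator component would then pick a starting cluster from the "start" counts and walk the follower table until it reaches the requested number of syllables.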

Risus Monkey said...

@Swordgleam: what browser are you using? I'll check it out.

@John Morrow: That's similar to what I'm doing. I'm not using a Markov chain in this particular tool because I wanted the results to be deterministic (if they are not, then it's a bug). I count the frequency of each consonant or vowel grouping depending on whether it occurs at the beginning, middle, or end of a word. Then I map (for example) the third most common initial vowel grouping in the English text to the third most common initial vowel grouping in the model text. When there is no mapping (due to a size mismatch in the samples), I pick the most common letter combo of the appropriate type in the model text.
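
For readers who want to see the rank-to-rank mapping in code, here is a minimal sketch of the idea. It is not the tool's actual source: the function names, the alphabetical tie-breaking, the treatment of "y" as a vowel, and the choice to leave a cluster unchanged when the model has nothing of that type are all assumptions made for illustration.

```python
import re
from collections import Counter

VOWELS = set("aeiouy")  # assumption: "y" is treated as a vowel
CLUSTER_RE = re.compile(r"[aeiouy]+|[^aeiouy]+")
WORD_RE = re.compile(r"[a-z]+")


def keyed_clusters(word):
    """Return [((position, kind), cluster), ...] for one lowercase word."""
    clusters = CLUSTER_RE.findall(word)
    last = len(clusters) - 1
    keyed = []
    for i, cluster in enumerate(clusters):
        position = "initial" if i == 0 else "final" if i == last else "internal"
        kind = "vowel" if cluster[0] in VOWELS else "consonant"
        keyed.append(((position, kind), cluster))
    return keyed


def ranked(text):
    """For each (position, kind), list clusters from most to least common."""
    counts = {}
    for word in WORD_RE.findall(text.lower()):
        for key, cluster in keyed_clusters(word):
            counts.setdefault(key, Counter())[cluster] += 1
    # Ties are broken alphabetically so the ranking is deterministic.
    return {key: sorted(ctr, key=lambda c: (-ctr[c], c)) for key, ctr in counts.items()}


def build_cipher(source_text, model_text):
    """Map the nth most common source cluster to the nth most common model cluster."""
    source, model = ranked(source_text), ranked(model_text)
    table = {}
    for key, source_list in source.items():
        model_list = model.get(key, [])
        for rank, cluster in enumerate(source_list):
            if rank < len(model_list):
                table[(key, cluster)] = model_list[rank]
            elif model_list:
                table[(key, cluster)] = model_list[0]  # fall back to the most common
            else:
                table[(key, cluster)] = cluster  # nothing of this type in the model
    return table


def translate(text, table):
    """Replace every cluster of every word using the table; unknown clusters pass through."""
    def convert(match):
        return "".join(table.get((key, c), c) for key, c in keyed_clusters(match.group(0)))
    return WORD_RE.sub(convert, text.lower())


if __name__ == "__main__":
    table = build_cipher("january february", "elen sila lumenn omentielvo")
    print(translate("january february", table))
```

Because the table is built once from fixed rankings, translating the same text twice with the same samples gives the same result, and words that share an ending in the source text share an ending in the output.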

John Morrow said...

The frequency correlation is interesting. The problem with not tracking what sounds normally follow each other is that the vowel and consonant sounds that are allowed to follow each other are part of what makes a language sound the way it does.

Risus Monkey said...

John Morrow wrote:
The problem with not tracking what sounds normally follow each other is that the vowel and consonant sounds that are allowed to follow each other are part of what makes a language sound the way it does.

Yeah, I know. But my intent was not so much to make the translated text sound like the model language (though it often does), but rather to make it internally consistent. So if the input is "january february", the result is "filario carario" (given the default pseudo-Elvish model text). As in English, both words have the same suffix because "ua" (internal) maps to "a", "r" (internal) maps to "r", and "y" (final) maps to "io". In effect, you get a language with rules. It's a cypher, because those rules map directly to the rules of the source language.

It *would* be nice to go farther and try to more accurately model the vowel-to-consonant (and vice versa) sound frequencies. I may eventually try to do that, in fact. But what I have so far is sufficient for my needs.

Thanks for the suggestions, though. I'd love to see the tool that you are using. :)

m.s. jackson said...

You know.....your blog is so full of gaming goodness, most others simply fail by comparison. I do not know where you find the time to do everything you do, but I am humbled by the sheer volume of gaming ideas/inspirations/insight (wow, all "i"s) and the fact that the quantity is matched by the quality. Awesome stuff!

Risus Monkey said...

Don't know where I find the time, either. I've got two young kids, a wife in grad school, and a demanding job with a (now) nasty commute. But I've got the blogging bug and I'm having a hell of a lot of fun.