Plus, also, a word generator update
Added 2020-11-14 10:08:08 +0000 UTC
I forgot to include this yesterday, but also, here's an update to the word generation script I use. I've fixed a bug that crashed the generator when generating new Russian words, because I had not thought about the implications of apostrophes. I've also added a new feature, "limit." This checks whether a sequence of characters of a specified length already exists in the selected corpus, which if nothing else helps to prevent situations where the generator, when running in exclusive mode (i.e. only return new words), would decide that "downtowns" was a new English word even though it's just "downtown" plus the letter "s."
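To make the "limit" idea concrete, here is a minimal Python sketch. It is not the actual script, and it assumes one possible reading of the check: a candidate is rejected if any run of `limit` consecutive characters in it is already a whole word in the corpus. The names `is_new_word`, `corpus`, and `limit` are made up for illustration.

```python
def is_new_word(candidate, corpus, limit=None):
    """Return True if `candidate` should count as new in exclusive mode.

    `corpus` is a set of known words. `limit` is a hypothetical parameter:
    the length of the character run to look up in the corpus.
    """
    # Plain exclusive mode: a word that is already in the corpus is not new.
    if candidate in corpus:
        return False
    # Assumed "limit" behaviour: slide a window of `limit` characters over
    # the candidate and reject it if any window is itself a corpus word.
    if limit is not None and len(candidate) >= limit:
        for start in range(len(candidate) - limit + 1):
            if candidate[start:start + limit] in corpus:
                return False
    return True


corpus = {"down", "town", "downtown"}
print(is_new_word("downtowns", corpus))           # True
print(is_new_word("downtowns", corpus, limit=8))  # False
print(is_new_word("dawntowns", corpus, limit=8))  # True
```

With no limit, "downtowns" slips through because it is not literally in the corpus. With a limit of 8, the window "downtown" matches and the candidate is thrown out.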
In most cases this won't actually matter! I added it in because a friend asked me if it would be possible to use Markov chains to generate descriptions for wines, and I loosely adapted the word generator to do so. Specifically, because my word generator works at the character level, I cheated by taking a sample of wine descriptions, mapping each unique word to its own Unicode glyph, and treating the result as any other word-generating problem.
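The word-to-glyph trick can be sketched roughly like this. It is an illustration under assumptions, not the attached code: the glyphs are taken from the Unicode Private Use Area starting at U+E000 (the real script may use a different range), words are split on whitespace only, and `build_codebook`, `encode`, and `decode` are hypothetical names.

```python
def build_codebook(descriptions, first_code_point=0xE000):
    """Assign one Unicode character to each unique word, in order of first appearance.

    Assumption: glyphs start at U+E000 (Private Use Area). That block only
    holds 6400 code points, so a large vocabulary would need another range.
    """
    word_to_glyph = {}
    for description in descriptions:
        for word in description.split():
            if word not in word_to_glyph:
                word_to_glyph[word] = chr(first_code_point + len(word_to_glyph))
    return word_to_glyph


def encode(description, word_to_glyph):
    """Turn a sentence into a pseudo-word: one glyph per word."""
    return "".join(word_to_glyph[word] for word in description.split())


def decode(pseudo_word, word_to_glyph):
    """Turn a pseudo-word back into a sentence."""
    glyph_to_word = {glyph: word for word, glyph in word_to_glyph.items()}
    return " ".join(glyph_to_word[glyph] for glyph in pseudo_word)


descriptions = [
    "this wine has aromas of cherry",
    "this wine has notes of plum",
]
codebook = build_codebook(descriptions)
pseudo_words = [encode(d, codebook) for d in descriptions]

print(len(codebook))                       # 8
print(len(pseudo_words[0]))                # 6
print(decode(pseudo_words[0], codebook))   # this wine has aromas of cherry
```

Because splitting is on whitespace only, "raspberry," and "raspberry" would get different glyphs. That is a simplification here, though it does mean punctuation survives the round trip.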
The catch here is that wine descriptions are so ridiculous that I had no good way of judging whether or not I was actually generating new ones :P so I added in the "limit" function. I've attached the wine generator as well, if you want to play around with it, as well as the seed file (ha, ha) I used for the experiment. If you make changes to the seed, running wine_preprocess.py will output a text file consisting of "words" that are really sentences, with one Unicode character standing in for each word.
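Since each of those "words" is just a string, a character-level chain can consume them without knowing the characters are really words. Here is a minimal sketch of such a chain in Python. It is a generic illustration rather than the attached generator: the `order` and `max_length` parameters and the "^" and "$" boundary markers are assumptions, and the markers must not appear in the corpus.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary markers, assumed absent from the corpus


def train(words, order=2):
    """Map each `order`-character context to the characters seen right after it."""
    transitions = defaultdict(list)
    for word in words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            transitions[padded[i:i + order]].append(padded[i + order])
    return transitions


def generate(transitions, order=2, max_length=40, rng=random):
    """Walk the chain from the start context until END is drawn or max_length is hit."""
    context = START * order
    output = []
    while len(output) < max_length:
        next_char = rng.choice(transitions[context])
        if next_char == END:
            break
        output.append(next_char)
        # Keep only the last `order` characters as the new context.
        context = (context + next_char)[-order:]
    return "".join(output)


training_words = ["downtown", "uptown", "township", "ownership"]
chain = train(training_words, order=2)
rng = random.Random(2020)  # seeded only so that repeated runs match
for _ in range(5):
    print(generate(chain, order=2, rng=rng))
```

A higher `order` makes the output stick closer to the corpus, and a lower one makes it stranger. The same two functions work unchanged on a list of glyph-encoded descriptions.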
The result, depending on your parameters, is somewhere between the normal ("this wine explodes with the blended notes of raspberry, game and earth"), the banal ("this elegant Pinot Noir has aromas and flavor") and the Dwarf Fortress:
Fascinating aromas of strawberry, cherry and plum, the wine has a silky palate of soft spice. 16 months in French and American oak add complexity to the expansive finish. Dark ripe raspberries, spices, lavender, hints of raspberries with notes of red fruit, such as plums, raspberries and hints of spice and earth, finishing smoothly. This delicious Pinot features black cherry fruits, raspberries, oolong tea leaves, raspberries and pipe tobacco. It evolves into a floral, pretty finish.