Select a section of text to find similar passages from other fairy tale texts, using an in-browser embedding index and search. Some text elements will award you points in categories I've defined that are common in fairy tales. Magical creatures are harder than you might think! You can add to your score by adding your own search terms to a "Yours" category.
Higher scores are given for rarer items (there is a Python script in the repo that weights the scores). These terms were originally seeded by looking at the top unigrams and bigrams, then added to during use and weighted with Claude's help. The icons were made with a glif.ai app that uses a Flux LoRA for a medieval style and adds a label.
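To make "rarer items score higher" concrete, here is a minimal sketch of one way such weighting could work. It is not the script from the repo: the function name `rarity_weights`, the 1 to 10 point range, the log inverse-frequency formula, and the example counts are all assumptions made up for illustration.

```python
import math


def rarity_weights(term_counts, min_points=1, max_points=10):
    """Map each term's corpus count to a point value; rarer terms get more points.

    Hypothetical helper: the formula and point range are assumptions,
    not the weighting used by the script in the repo.
    """
    total = sum(term_counts.values())
    # Inverse-frequency score: the negative log of each term's relative frequency.
    raw = {t: -math.log(c / total) for t, c in term_counts.items() if c > 0}
    if not raw:
        return {}
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    weights = {}
    for term, score in raw.items():
        # Rescale to 0..1, then stretch onto the point range and round.
        scaled = 0.0 if span == 0 else (score - lo) / span
        weights[term] = round(min_points + scaled * (max_points - min_points))
    return weights


# Made-up counts, for illustration only.
counts = {"king": 900, "witch": 120, "dragon": 40, "unicorn": 3}
weights = rarity_weights(counts)
for term in sorted(weights, key=weights.get):
    print(term, weights[term])
```

With these made-up counts, the most common term lands at the bottom of the point range and the rarest at the top, which is the behavior described above.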
The similarity score under the text shows how similar a new passage is to the one you selected. This app uses a tiny embedding model, bge-micro, with Transformers.js from Hugging Face. The index and similarity search in the browser are handled by client-vector-search, a Node package.
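For readers curious about what a similarity score over embeddings means, here is a rough sketch in Python. It assumes the score is cosine similarity, a common choice for embedding search, which the text above does not confirm. The app itself does this in the browser through client-vector-search; the function names and the tiny three-number vectors below are invented for illustration and are not real bge-micro outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=3):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "index": sentence -> made-up embedding (assumption, illustration only).
index = {
    "The witch lived in the wood.": [0.9, 0.1, 0.0],
    "The king held a feast.": [0.0, 0.2, 0.9],
    "A hag dwelt in the forest.": [0.8, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedding of the selected text
results = top_k(query, index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A real index would hold one vector per sentence from the embedding model, but the ranking idea is the same: embed the selection, score it against every stored vector, and show the highest scores.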
The text is sourced from fairy tale books on Project Gutenberg, split into sentences, and then classified for "descriptiveness" using a small custom-trained spaCy model. The dataset was later reduced quite a bit, to just four authors, for this demo app. There may still be duplicate stories (depending on the book collection), and some lines may contain text that today's readers will find offensive (racism, sexism, violence and gore). Fairy tales aren't sweet.
Made by @arnicas/Lynn Cherny, who is on Bluesky, Mastodon, and X; the repo is at the badly named github.com/arnicas/simple-embedded-text-navigator. This was originally both a tech demo (making an in-browser fairy tale text navigator) and a toy for me to look at fairy tale content.
Add words and phrases that you want to track in the "Yours" category:
No words added yet. Add some words above!