
The hours estimate is calibrated based on a bunch of vndb length ratings, with "very short" assigned to 1 hour and "very long" assigned to 60 hours. It is completely useless for VNs with nontrivial amounts of gameplay, or where rereading is expected, or where you're expected to miss a lot of content, or where you're only meant to read one or two out of several routes. The character counting stats here ignore some punctuation characters. There is a graph at the bottom of the page. Example of calibrating with only character count as a source variable:

The hours estimate is kanji/18000 + hiragana/31500 + katakana/61400. In general, it assumes a reading speed of 475 characters per minute.

The "freqlistX% Target" metrics are an estimate of lexical complexity. High measurements mean that the script uses uncommon words more frequently, or common words less frequently. Commonness is based on every long-enough script combined, not on general-purpose frequency lists or the frequency list of the single script in question. However, the 20 most common uncovered words in the script are ignored, because they're probably names, setting jargon, or even parsing errors. This entire process also ignores grammatical words, proper nouns, and numbers. The "metric a/b/c/d" lines are derived from linear regression targeting the "freqlist 92.5% Target" metric. An exact description of the processing for these metrics is here.
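As a rough illustration of the hours estimate (kanji/18000 + hiragana/31500 + katakana/61400), here is a minimal Python sketch. The function names and the Unicode ranges used to classify characters are assumptions made for this example; the page does not say exactly which characters count as kanji, hiragana, or katakana, or which punctuation is ignored.

```python
# Sketch of the hours estimate: kanji/18000 + hiragana/31500 + katakana/61400.
# ASSUMPTION: characters are classified by these Unicode ranges; the site's
# real classification (and its punctuation handling) may differ.

def count_scripts(text: str) -> dict:
    """Count kanji, hiragana, and katakana characters in text."""
    counts = {"kanji": 0, "hiragana": 0, "katakana": 0}
    for ch in text:
        cp = ord(ch)
        if 0x4E00 <= cp <= 0x9FFF:      # CJK Unified Ideographs (basic block)
            counts["kanji"] += 1
        elif 0x3041 <= cp <= 0x3096:    # hiragana letters
            counts["hiragana"] += 1
        elif 0x30A1 <= cp <= 0x30FA:    # katakana letters (excludes the long-vowel mark)
            counts["katakana"] += 1
        # everything else (punctuation, Latin, digits, whitespace) is ignored
    return counts


def estimate_hours(text: str) -> float:
    """Apply the per-script divisors from the formula to the character counts."""
    c = count_scripts(text)
    return c["kanji"] / 18000 + c["hiragana"] / 31500 + c["katakana"] / 61400
```

For scale, 475 characters per minute is 28,500 characters per hour, which falls between the kanji divisor (18,000) and the hiragana divisor (31,500).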

If you want to add a script, post it to the Requests page. TODO: Fix scripts with broken linewraps (affects sentence count, line count, and metrics a/d). All stats work on deduplicated copies of the script. Click on Old Stats if you want to see the old stats, like the Core 6k or Hayashi stats.
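The page does not say how scripts are deduplicated. One plausible reading, shown here purely as an assumption, is that repeated lines are dropped so that text shared between routes is only counted once. The function name `dedupe_lines` is made up for this sketch.

```python
# ASSUMPTION: deduplication works per line, keeping the first occurrence.
# The site's actual method is not described and may be different.

def dedupe_lines(script: str) -> str:
    """Return the script with blank lines and repeated lines removed."""
    seen = set()
    kept = []
    for line in script.splitlines():
        key = line.strip()
        if not key or key in seen:
            continue  # skip blank lines and lines already seen
        seen.add(key)
        kept.append(line)
    return "\n".join(kept)
```

A line-level approach like this is also sensitive to broken linewraps, since a sentence split across two lines will not match its unsplit copy.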
