Zeroes and Ones and Your Odds of Writing a Best-Seller


Did you ever suspect the runaway best-seller Fifty Shades of Grey was written by robots?  Well, somebody check E.L. James for vital signs because she might actually be an algorithm.  Check this out:

[Image: wordrep]

Surely a human being would die of boredom before biting a lip in print forty-three times in one novel.

Actually, I’m skewing things a bit.  But it is true that “[s]cientists have developed an algorithm which can analyse a book and predict with 84 per cent accuracy whether or not it will be a commercial success.” (Source)

Using public-domain books downloaded from Project Gutenberg, scientists from Stony Brook University in New York developed a program based on “statistical stylometry, which mathematically examines the use of words and grammar,” to predict the popularity of a book, matching the program’s results to the sales of works from the past. The experiment involved a wide range of literary styles, from science fiction to novels to poetry. Factors in determining sales and popularity included the “style” of writing as well as novelty in plot and character. (They do acknowledge that “luck” plays a role as well.)

The program correctly predicted the success or failure of those works an astonishing 84% of the time.
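For anyone wondering what an accuracy figure like that means in practice, here is a minimal sketch in Python. It simply counts how often a prediction matches what actually happened. The function name and the ten made-up books below are my own illustration, not the researchers' code or data.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the real outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one book to score")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical data: True means "successful", False means "not successful".
actual = [True, True, False, False, True, False, True, False, True, True]
predicted = [True, True, False, True, True, False, True, False, False, True]

# Two of the ten guesses are wrong, so this prints 80%.
print(f"{accuracy(predicted, actual):.0%}")
```

An 84% score means the same thing on a bigger pile of books: roughly 84 correct calls out of every 100.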

So, in more concrete terms, what do the findings suggest you should do to increase your odds of becoming a best-selling writer?

  1. Use a lot of conjunctions.
  2. Use a lot of nouns and adjectives.

[Image: hemingway_gun]
For the record, Papa Hemingway Disapproves of This “Advice”

Avoid doing these things:

  1. Using an abundance of verbs and adverbs.
  2. Explicitly describing “actions and emotions such as ‘wanted’, ‘took’ or ‘promised’.”
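If you want to see how your own prose stacks up against the two lists above, here is a rough sketch in Python. It is not the Stony Brook program. It assumes a tiny hand-picked word list (TOY_LEXICON, entirely my invention) in place of a real part-of-speech tagger, and it only reports what share of your words fall into each category.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only. A real analysis would
# use a proper part-of-speech tagger rather than a hand-picked word list.
TOY_LEXICON = {
    "and": "conjunction", "but": "conjunction", "or": "conjunction",
    "house": "noun", "river": "noun", "night": "noun",
    "old": "adjective", "cold": "adjective", "quiet": "adjective",
    "wanted": "verb", "took": "verb", "promised": "verb",
    "quickly": "adverb", "really": "adverb", "suddenly": "adverb",
}


def category_shares(text, lexicon=TOY_LEXICON):
    """Return each word category's share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    # Words missing from the lexicon are lumped together as "other".
    counts = Counter(lexicon.get(word, "other") for word in words)
    return {category: n / len(words) for category, n in counts.items()}


sample = (
    "The old house and the cold river. "
    "She really wanted it, but he quickly took it."
)
shares = category_shares(sample)
for category, share in sorted(shares.items()):
    print(f"{category:12s} {share:.1%}")
```

Swap in a chapter of your own manuscript for the sample text and, if the study is to be believed, hope the conjunction, noun, and adjective shares come out ahead of the verbs and adverbs.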

The authors of the program, of course, stand by their conclusion, arguing that:

“Previous work has attempted to gain insights into the ‘secret recipe’ of successful books. But most of these studies were qualitative, based on a dozen books, and focused primarily on high-level content – the personalities of protagonists and antagonists and the plots. Our work examines a considerably larger collection – 800 books – over multiple genres, providing insights into lexical, syntactic, and discourse patterns that characterise the writing styles commonly shared among the successful literature.”

With results like these, one wonders whether this algorithm might actually be applied to hopeful writers, both those awaiting publication and those already established.  Will a publisher take a chance on a novel that might take time to build an audience? One that scored “low” on the likely success meter?