use 200 books with known labels, plus a list of features about those books, to determine which features matter most in predicting whether a book is male- or female-centered
selecting important features (from a long list) is the same task as selecting important predictors in marketing (i.e. who will buy) or in recidivism (who will reoffend)
see slides
test the model on a separate set of books and check how often its predictions are correct
use the model to predict male- or female-centered for all remaining books
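
A minimal sketch of the workflow above, using made-up data: rank features by how differently they behave in the two labeled groups, fit a simple rule on the top feature, then check it on held-out books. The feature names, the toy values, and the mean-gap score are all illustrative assumptions, not the method from the slides.

```python
from statistics import mean

# Hypothetical labeled books: invented feature values plus a known label.
train = [
    ({"he_per_1000": 30, "she_per_1000": 5, "pages": 300}, "male"),
    ({"he_per_1000": 28, "she_per_1000": 8, "pages": 250}, "male"),
    ({"he_per_1000": 10, "she_per_1000": 25, "pages": 280}, "female"),
    ({"he_per_1000": 12, "she_per_1000": 30, "pages": 310}, "female"),
]
# Held-out books the model never saw during fitting.
test = [
    ({"he_per_1000": 27, "she_per_1000": 6, "pages": 290}, "male"),
    ({"he_per_1000": 9, "she_per_1000": 22, "pages": 260}, "female"),
]


def class_means(rows, name):
    """Mean of one feature within each labeled group."""
    male = mean(f[name] for f, y in rows if y == "male")
    female = mean(f[name] for f, y in rows if y == "female")
    return male, female


def rank_features(rows):
    """Order features by the gap between class means, scaled by the overall mean."""
    scores = {}
    for name in rows[0][0]:
        male, female = class_means(rows, name)
        overall = mean(f[name] for f, _ in rows)
        scores[name] = abs(male - female) / overall
    return sorted(scores, key=scores.get, reverse=True)


def fit_rule(rows, name):
    """Threshold halfway between the class means, plus which side means 'male'."""
    male, female = class_means(rows, name)
    return (male + female) / 2, male > female


def predict(features, name, threshold, male_is_high):
    is_high = features[name] > threshold
    return "male" if is_high == male_is_high else "female"


ranked = rank_features(train)
best = ranked[0]
threshold, male_is_high = fit_rule(train, best)
correct = sum(predict(f, best, threshold, male_is_high) == y for f, y in test)
accuracy = correct / len(test)
print("features, most to least important:", ranked)
print("held-out accuracy:", accuracy)
```

In this toy data the page count barely differs between the groups, so it ranks last; a real analysis would use many more books and a more robust importance measure.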
Grouping Sherlock Holmes books (see the app)
without you providing a list of features, ML groups books based on word frequency
this is unsupervised ML (in contrast to supervised ML, where features were compared to known outcomes)
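
A rough sketch of grouping texts by word frequency with no labels supplied. The four snippets are invented placeholders, not Sherlock Holmes text, and the stop-word list, the similarity threshold, and the greedy grouping rule are assumptions chosen for brevity; the app may well use a different clustering method.

```python
import math
from collections import Counter

# Very small, hand-picked stop-word list (an assumption for this toy example).
STOP_WORDS = {"the", "and", "on", "as", "under"}

# Invented snippets standing in for whole books.
texts = {
    "story_a": "the hound ran across the moor and the hound howled on the moor",
    "story_b": "fog on the moor hid the hound as the hound waited",
    "story_c": "the clerk left the bank and the thief robbed the bank vault",
    "story_d": "the thief opened the vault under the bank",
}


def word_counts(text):
    """Word-frequency vector for one text, ignoring stop words."""
    return Counter(w for w in text.lower().split() if w not in STOP_WORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_texts(texts, threshold=0.3):
    """Greedy grouping: join the first group whose first member is similar enough."""
    vectors = {name: word_counts(text) for name, text in texts.items()}
    groups = []
    for name in texts:
        for group in groups:
            if cosine(vectors[name], vectors[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


groups = group_texts(texts)
print(groups)
```

No labels or feature list were supplied: the groups come only from which words the texts share.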
Where are you from? (see the app)
the app's guess is pretty accurate
based on multiple questions, each of which contributes some probability of assigning you to a location
questions may differ from person to person, since particular sets of questions point to particular locations
each question carries some information; no single answer is diagnostic
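
One way to picture how several weak answers combine is a naive Bayes style update, sketched below. The questions, regions, and every probability are invented for illustration, and treating the answers as independent is an assumption; this is not a description of how the app actually works.

```python
# Hypothetical P(answer | region) tables; all numbers are made up.
LIKELIHOOD = {
    "soda_word": {
        "pop": {"Midwest": 0.7, "South": 0.1, "Northeast": 0.2},
        "coke": {"Midwest": 0.1, "South": 0.6, "Northeast": 0.1},
        "soda": {"Midwest": 0.2, "South": 0.3, "Northeast": 0.7},
    },
    "you_plural": {
        "y'all": {"Midwest": 0.2, "South": 0.8, "Northeast": 0.1},
        "you guys": {"Midwest": 0.8, "South": 0.2, "Northeast": 0.9},
    },
}
REGIONS = ["Midwest", "South", "Northeast"]


def posterior(answers):
    """Start from equal odds, multiply in each answer's likelihood, then normalize."""
    scores = {region: 1 / len(REGIONS) for region in REGIONS}
    for question, answer in answers.items():
        for region in REGIONS:
            scores[region] *= LIKELIHOOD[question][answer][region]
    total = sum(scores.values())
    return {region: score / total for region, score in scores.items()}


after_one = posterior({"soda_word": "soda"})
after_two = posterior({"soda_word": "soda", "you_plural": "y'all"})
best_after_one = max(after_one, key=after_one.get)
best_after_two = max(after_two, key=after_two.get)
print(best_after_one, after_one)
print(best_after_two, after_two)
```

With these made-up numbers the first answer alone favors one region and the second answer shifts the best guess to another, which is the sense in which no single answer is diagnostic.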
Lyrics (see interactive site)
some lyrics are timeless (Love)
repetition is increasing
number of words is decreasing
vulgarity is increasing
word use reflects society
ML can be used to develop a model to predict genre from lyrics
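
Trends like these have to be measured before they can be compared over time or fed to a model. A small sketch of two possible measures follows; the lyric lines are invented, and defining repetition as one minus the share of distinct words is an assumption, not necessarily the measure the interactive site uses.

```python
def lyric_stats(lyrics):
    """Word count, distinct-word count, and a simple repetition score in [0, 1)."""
    words = lyrics.lower().split()
    distinct = set(words)
    return {
        "n_words": len(words),
        "n_distinct": len(distinct),
        # 0 means every word is different; values near 1 mean heavy repetition.
        "repetition": 1 - len(distinct) / len(words),
    }


# Invented lines, not taken from real songs.
repetitive = lyric_stats("dance dance dance now dance all night dance")
varied = lyric_stats("the river carries every secret i told the moon")
print(repetitive)
print(varied)
```

Scores like these, computed per song, are the kind of features a genre-prediction model could take as input.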
How we find music has changed
what music you listen to is the result of targeting (e.g. Spotify suggestions) rather than general play