What books will be your bestseller? A machine learning approach with Amazon Kindle

Keywords: Machine learning, Book descriptions, Recommendation systems, Content-based filtering, Natural language processing
Abstract:

Purpose: With the rapid increase in internet use, most people tend to purchase books through online stores. Several such stores also provide book recommendations for buyer convenience, and both collaborative and content-based filtering approaches have been widely used for building these recommendation systems. However, both approaches have significant limitations, including cold start and data sparsity. To overcome these limitations, this study aims to investigate whether user satisfaction can be predicted based on easily accessible book descriptions.

Design/methodology/approach: The authors collected a large-scale Kindle Books data set containing book descriptions and ratings, and calculated whether a specific book will receive a high rating. For this purpose, several feature representation methods (bag-of-words, term frequency-inverse document frequency (TF-IDF) and Word2vec) and machine learning classifiers (logistic regression, random forest, naive Bayes and support vector machine) were used.

Findings: The classifiers used show substantial accuracy in predicting reader satisfaction. Among them, the random forest classifier combined with the TF-IDF feature representation method exhibited the highest accuracy, at 96.09%.

Originality/value: This study revealed that user satisfaction can be predicted based on book descriptions and shed light on the limitations of existing recommendation systems. Further, both practical and theoretical implications have been discussed.

Retrieval and indexing: Although this research focuses on book recommendation for online bookstores, its techniques can also be used in library systems to improve the search and indexing of books and to provide better recommendations based on the substance of their descriptions.

Natural language processing (NLP): The study uses a variety of feature representation methods, such as bag-of-words, TF-IDF and Word2vec, to extract information from book descriptions.

Machine learning: A variety of machine learning classifiers, such as logistic regression, random forests, naive Bayes and support vector machines, are used to predict users' satisfaction with a book.

Data exploration: The purpose of the research is to predict whether a book will receive a high rating, based on a large-scale data set of Kindle books that includes book descriptions and ratings.
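
The feature-representation step named in the abstract can be illustrated with a small sketch. It is not the authors' code. The three toy descriptions, their ratings, the helper names and the 4.0 cutoff for a "high rating" are all assumptions made for illustration, and the smoothed IDF formula is one common variant rather than the one the study necessarily used. The sketch uses only the Python standard library and stops at the feature vectors and labels; in the study, vectors of this kind were passed to classifiers such as a random forest.

```python
import math
import re
from collections import Counter

# Toy data: (description, average rating). Invented for illustration only.
BOOKS = [
    ("A gripping thriller about a detective in London", 4.6),
    ("A slow romance set in a small seaside town", 3.2),
    ("A detective thriller with a shocking twist", 4.4),
]
HIGH_RATING_THRESHOLD = 4.0  # assumption: the abstract does not state the cutoff


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Return one Counter of raw term counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term count * smoothed IDF, where
    IDF = ln((1 + N) / (1 + df)) + 1 (an assumed, commonly used variant).
    """
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Iterating over a Counter yields each distinct term once per document,
    # so this counts the number of documents containing each term.
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


descriptions = [d for d, _ in BOOKS]
# Binary target: 1 if the book counts as highly rated, else 0.
labels = [int(r >= HIGH_RATING_THRESHOLD) for _, r in BOOKS]
weights = tf_idf(descriptions)
```

In this sketch a term that appears in only one description, such as "gripping", receives a larger weight than one shared across descriptions, such as "thriller". This is the property that lets TF-IDF emphasise the distinctive words in a description.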