How do we discover books? Sometimes we ask people (bookstore and library staff, journalists, friends, family), but more often than not, we ask search engines (“thriller Brazil,” “kids’ books dragons”). But these tools rely on the way books are described. Is the book actually labelled as a thriller? Do we know that the plot takes place in Brazil? How do search engines know that a particular book contains references to dragons?
There are thousands of ways to describe a book, and publisher time is limited. Book metadata is also not always sufficient for every type of query: any bookseller who’s been asked to find a book by the colour of its cover can tell you that. On top of this, discovery increasingly happens through recommendation engines that are often based on historical sales data, not on the actual content of the books.
Projet TAMIS was launched by Éditions du Septentrion (a history book publisher based in Québec City) in collaboration with Brix Labs (a consulting agency with a focus on technology applied to cultural industries) to address these challenges. Researchers are using algorithms, open source code, and free or low-cost APIs to extract descriptive information about books from the books themselves, at a granularity and scale that would not be possible for humans. These findings have the potential to enrich traditional metadata and to create new ways to discover books.
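To make the idea of extracting descriptions "from books" concrete, here is a minimal sketch in Python. It is not the project's actual method: the `VOCABULARY` mapping, the `suggest_tags` function, and the `min_hits` threshold are all hypothetical names chosen for illustration, and simple word counting stands in for whatever algorithms and APIs the researchers use.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary (an assumption for this sketch):
# each descriptive tag maps to the words that count as evidence for it.
VOCABULARY = {
    "setting:Brazil": {"brazil", "rio", "amazon"},
    "theme:dragons": {"dragon", "dragons"},
}

def suggest_tags(text, vocabulary=VOCABULARY, min_hits=2):
    """Return (tag, hit_count) pairs for tags mentioned at least min_hits times."""
    # Lowercase the text and split it into runs of letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # Sum the occurrences of every trigger word for each tag.
    hits = {
        tag: sum(counts[word] for word in triggers)
        for tag, triggers in vocabulary.items()
    }
    # Most frequently evidenced tags first; ties keep vocabulary order.
    ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    return [(tag, n) for tag, n in ranked if n >= min_hits]

sample = (
    "The dragon circled Rio at dusk. "
    "By morning, two more dragons had crossed into Brazil."
)
print(suggest_tags(sample))
```

Run over the full text of a book rather than a two-sentence sample, the same counting would suggest a setting and a theme that may never have been entered in the publisher's metadata.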
This presentation will share some of the project’s results, which will be made available online and provided to publishers in a form that makes them easy to upload to their metadata aggregators.