Current Geek 111: “There are just not enough of those”

Join Tom and Scott as they discuss the Google Books blog's explanation of how it attempts to answer a difficult but commonly asked question: how many different books are there? Cataloging systems are fraught with duplicates and input errors, and each one covers only a fraction of the total distinct titles. They also vary widely by region, and they haven’t been around nearly as long as humanity has been writing books.

“When evaluating record similarity, not all attributes are created equal. For example, when two records contain the same ISBN this is a very strong (but not absolute) signal that they describe the same book, but if they contain different ISBNs, then they definitely describe different books. We trust OCLC and LCCN number similarity slightly less, both because of the inconsistencies noted above and because these numbers do not have checksums, so catalogers have a tendency to mistype them.”
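For the curious, here is a rough idea of why a checksum matters. This is only a minimal Python sketch of the standard ISBN-10 and ISBN-13 check-digit rules, not anything taken from Google's actual pipeline; the function names are made up for illustration.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Check an ISBN-10: weighted sum (weights 10 down to 1) must be divisible by 11."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) != 10:
        return False
    total = 0
    for position, char in enumerate(digits):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10, and only as the final check digit
        elif "0" <= char <= "9":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def isbn13_is_valid(isbn: str) -> bool:
    """Check an ISBN-13: alternating weights 1 and 3, sum must be divisible by 10."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(digits))
    return total % 10 == 0


if __name__ == "__main__":
    print(isbn10_is_valid("0-306-40615-2"))      # a correctly typed ISBN-10
    print(isbn10_is_valid("0-306-40651-2"))      # same number with two digits swapped
    print(isbn13_is_valid("978-0-306-40615-7"))  # the ISBN-13 form of the same book
```

A swapped or mistyped digit usually breaks the sum, so a cataloging system can reject the typo on the spot. OCLC and LCCN numbers have no such safeguard, which is the reason the quote gives for trusting them slightly less.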

After refining the data as much as they could, they estimated there are 129,864,880 different books in the world.

And we get a great reader submission to boot!


- Direct MP3 Download
- iTunes Subscription
- RSS Feed
- The Frogpants Ultra-Feed

Hey! Why not leave us a nice review on iTunes if you like the show?

This entry was posted in Episodes.
