From "People have exactly one canonical full name" to "People have names": names are hard.
Supposedly "anonymous" datasets have a history of revealing far more personal information (including identifying details and details which could reasonably be considered private, such as sexuality) than intended.
Brave new non-relational world:
For this feature, the fully denormalized Cassandra dataset weighs in at 3 terabytes and 76 billion columns.
Most of the first-generation mashups have been limited to the data freely available from governments, sites with APIs, etc.; I can't wait to see what we'll get now that they can use Freebase. I can't wait to use it myself, for that matter. Best of all, those magical letters: "CC-BY".
Piggy Bank plugins are site-specific screen-scrapers that extract structured data as RDF, which can be analyzed, sorted and connected at your leisure, or shared through a "semantic bank".
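
To make the "site-specific screen-scraper that emits RDF" idea concrete, here is a rough sketch in Python. It is not Piggy Bank's actual plugin API. The page markup (`<span class="title">`, `<span class="price">`), the `http://example.org/vocab#` vocabulary, and the names `ListingScraper` and `scrape_to_ntriples` are all made up for illustration. It only shows the general shape: walk one site's HTML, pull out the fields you care about, and write them out as N-Triples.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary namespace, used only for this illustration.
EX = "http://example.org/vocab#"


class ListingScraper(HTMLParser):
    """Collects text from <span class="title"> and <span class="price">.

    The markup is an assumption about one imaginary site, which is the
    whole point of a site-specific scraper.
    """

    def __init__(self):
        super().__init__()
        self._field = None   # field currently being read, if any
        self.items = []      # one dict per scraped listing

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("title", "price"):
            self._field = cls
            if cls == "title":
                # A title starts a new listing.
                self.items.append({})

    def handle_data(self, data):
        if self._field and self.items:
            item = self.items[-1]
            item[self._field] = item.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def scrape_to_ntriples(html, base):
    """Return N-Triples lines describing each listing found in html."""
    parser = ListingScraper()
    parser.feed(html)
    parser.close()
    lines = []
    for index, item in enumerate(parser.items):
        subject = f"<{base}#item{index}>"
        for field, value in item.items():
            literal = escape_literal(value.strip())
            lines.append(f'{subject} <{EX}{field}> "{literal}" .')
    return lines


if __name__ == "__main__":
    page = '<div><span class="title">Red Bike</span><span class="price">120</span></div>'
    for triple in scrape_to_ntriples(page, "http://example.org/page"):
        print(triple)
```

Once the data is in triples, the "analyzed, sorted and connected at your leisure" part falls to whatever RDF tooling you prefer; the scraper's only job is to get the page out of HTML.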