the whole world burns

Archive for category 'data'

Falsehoods Programmers Believe About Names

From "People have exactly one canonical full name" to "People have names": names are hard.

Broken Promises: Responding to the Surprising Failure of Anonymization

Supposedly "anonymous" datasets have a history of revealing far more personal information (including identifying details and details which could reasonably be considered private, such as sexuality) than intended.
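The usual mechanism is linkage: fields left in the "anonymous" data are joined against a public dataset that carries names. Here's a rough sketch of that idea. Every record, field name and the `reidentify` helper are invented for illustration, and real attacks are considerably messier.

```python
# Hypothetical data: all records and field names are made up for illustration.
anonymized = [
    {"zip": "02138", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
]
public_roll = [
    {"name": "A. Example", "zip": "02138", "birth_year": 1975, "sex": "F"},
    {"name": "B. Sample", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "C. Placeholder", "zip": "02139", "birth_year": 1990, "sex": "M"},
]

# Fields that are harmless alone but identifying in combination (assumed set).
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(anonymized_rows, named_rows):
    """Return (name, diagnosis) pairs where the quasi-identifiers match one person."""
    # Index the named dataset by its quasi-identifier tuple.
    index = {}
    for person in named_rows:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in anonymized_rows:
        key = tuple(row[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        # Only an unambiguous match counts as a re-identification.
        if len(names) == 1:
            matches.append((names[0], row["diagnosis"]))
    return matches


matches = reidentify(anonymized, public_roll)
for name, diagnosis in matches:
    print(name, "->", diagnosis)
```

No names were ever in the first list, yet both rows end up attached to one.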

Digg's "your friend has Dugg this" dataset

Brave new non-relational world:

For this feature, the fully denormalized Cassandra dataset weighs in at 3 terabytes and 76 billion columns.

OpenAustralia

I was going to express a fond wish that data.australia.gov.au would be a good first step towards the kind of free access that could give us an Australian TheyWorkForYou -- then discovered that OpenAustralia has existed since June 2008.

data.australia.gov.au

Downloadable Australian government datasets under open licenses!

Freebase: "an open, shared database of the world's knowledge"

Most of the first-generation mashups have been limited to the data freely available from governments, sites with APIs etc.; I can't wait to see what we'll get now that they can use Freebase. I can't wait to use it myself, for that matter. Best of all, those magical letters: "CC-BY".

Piggy Bank - turn your browser into a mashup platform

Piggy Bank plugins are site-specific screen-scrapers that extract structured data as RDF, which can be analyzed, sorted and connected at your leisure, or shared through a "semantic bank".
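To make "screen-scraper that extracts RDF" a little more concrete, here's a minimal sketch of the general idea in Python. It is not Piggy Bank's plugin API: the sample page, the `LinkScraper` class and the `to_ntriples` helper are all hypothetical, and the only real vocabulary used is the Dublin Core `title` property.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a site being scraped.
PAGE = """
<ul>
  <li class="book"><a href="http://example.org/book/1">First Title</a></li>
  <li class="book"><a href="http://example.org/book/2">Second Title</a></li>
</ul>
"""

# Dublin Core "title" property, used as the predicate for each link's text.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


class LinkScraper(HTMLParser):
    """Collect (subject, predicate, object) triples from <a href> elements."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._href = None  # href of the <a> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        text = data.strip()
        if self._href and text:
            self.triples.append((self._href, DC_TITLE, text))

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals."""
    lines = []
    for subject, predicate, obj in triples:
        literal = obj.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> <%s> "%s" .' % (subject, predicate, literal))
    return lines


scraper = LinkScraper()
scraper.feed(PAGE)
lines = to_ntriples(scraper.triples)
print("\n".join(lines))
```

Once the page is reduced to triples like these, the sorting and connecting can happen in any RDF-aware tool rather than in the scraper itself.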

Small things, links and miscellany, sparkling with light. Sam's tumblelog.
