There are many kinds of database

Bob Metcalfe always used to say there are just two kinds of network: Ethernet and Ethernot. He was commenting that any network not based on Ethernet was going to end up losing to Ethernet in the marketplace. With the demise of AppleTalk (Apple), Token Ring (IBM) and DECnet (Digital), he was more or less proved right.

Bob has an axe to grind, as one of the co-inventors of Ethernet, but that doesn’t make him wrong. The Ethernet/Ethernot remark should not be confused with Metcalfe’s Law, coined by the same Bob Metcalfe, which says that the value of a network increases in proportion to the square of the number of participants. Perhaps I can call the Ethernet/Ethernot comment Metcalfe’s Hypothesis (not all his predictions were correct).

Sigma’s investment in Gainspan is a bet that Metcalfe’s Hypothesis will continue to be right, this time with respect to Zigbee and other approaches proposed for sensor networks.

The database world has been dominated by the relational database model for many years, and all the famous commercial databases such as Oracle, Sybase, IBM DB2 and MS SQL Server, along with the open source MySQL, are relational database management systems (RDBMSs). The relational data model can be thought of as tables of data which are related to each other (the way a table of bank accounts is related to a table of banking transactions). These systems are also known as SQL databases because they are programmed in Structured Query Language (SQL), which was created for RDBMSs and is the standard language for working with them. Separate from the data model and the language used to talk to the data, RDBMSs are mostly expected to manage transactions properly (and be ACID compliant). Simply put, this is the capability to ensure that, for example, a money transfer is properly recorded in both the sending and receiving accounts, so that it is impossible to record one side of the transaction without the other.
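To make the money transfer example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the account names and the helper functions (make_bank, transfer, balances) are my own assumptions for illustration, not taken from any particular banking system. The credit is deliberately written before the debit so that a failed debit shows the rollback undoing work that had already been done.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute(
            "CREATE TABLE accounts ("
            " name TEXT PRIMARY KEY,"
            " balance INTEGER NOT NULL CHECK (balance >= 0))"
        )
        conn.executemany(
            "INSERT INTO accounts (name, balance) VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, sender, receiver, amount):
    """Record both sides of a transfer, or neither."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if any statement inside it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            # If the sender cannot cover the amount, the CHECK constraint
            # fails here and the credit above is rolled back with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
    except sqlite3.IntegrityError:
        return False
    return True


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

With the starting balances above, transfer(conn, "alice", "bob", 500) returns False and leaves both accounts exactly as they were, which is the "both sides or neither" guarantee in miniature.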

Now there is a new gang of databases in town, and they have gathered under the “NoSQL” banner. They are clustered around the Bigtable and Hadoop technologies, made famous largely by their Google-related provenance.

Many (most? all?) NoSQL databases do not have transactional or ACID capabilities, and don’t seem to need them, at least for now, because the use cases that prompted their development really are that different. NoSQL databases are not gaining popularity because they are better at the same things as an RDBMS. This is not about more or faster money transfers. This is about new problems which don’t need transactional integrity and do need to analyze data sets bigger than an RDBMS can economically manage (or manage at all). Sigma’s investments with Michael Stonebraker (see earlier post) are, in many ways, bets on a future in which multiple kinds of databases flourish.

So I am tempted to postulate Dale’s Hypothesis, somewhat in opposition to the idea of Metcalfe’s Hypothesis, that there can be many kinds of database, and you don’t need SQL to have a sequel.

1 comment:

Dan Greenberg said...

Good article Richard.

In a slightly different realm, I'd add a corollary to Metcalfe's Hypothesis: there's HTML and HTMLnot. Until recently, this did not seem to be the case in video -- there are a lot of video implementations that vary across platforms in general and even on the web itself. (Think Flash, QuickTime, Silverlight, and more.) But HTML5 seems to be winning the battle:
Google bought On2, but has HTML5/H.264 experiments for YouTube.
Apple has battled against "proprietary" Flash with open HTML5 in Safari (including iDevices).
Adobe is proposing ad formats which can use Flash... or HTML5.
And it looks like Microsoft is transitioning Silverlight, or abandoning its video delivery, in favor of HTML5.

I know where I'm placing my bets in video!