Phase Forward: Remembering the Early Days


I went to Newton City Hall today to run an errand and saw the Balsamo/Millennium walkway that the city created to celebrate the year 2000. I remembered that when Paul Bleicher and I heard about this walkway in 1999, we thought it would be fitting to mark the fact that Newton was the first home of Phase Forward, the company we had co-founded a couple of years before.

In early 1999 it wasn’t clear how successful Phase Forward would go on to be. However, we were proud enough, even then, to have started the company in Newton, where we also both lived (and still do) with our respective families. Starting in 1996 we spent many hours in a study room in the Newton Free Library working on a business plan. Once we received venture capital funding (from Atlas Ventures and North Bridge Venture Partners), we opened our first office at 51 Winchester St, Newton, in 1997. It was a couple more years before the company moved out to Waltham. The later history of Phase Forward included an IPO and then a successful acquisition by Oracle Corporation in 2010.

I am very proud that I got to work with Paul on his ideas that became so successful at Phase Forward. We had fun, changed an industry, created a bunch of jobs, and got to see some success along the way.

Hopefully the Phase Forward name will remain clear on this pathway as a mark of our gratitude for starting out in Newton, even as the name fades from prominence now that it is no longer an independent company.

Now this is venture cycling!

Check out this wonderful three-minute movie – watch to the end – and see how venture capital and cycling really come together.

The Invisible Bicycle Helmet | Fredrik Gertten from Focus Forward Films on Vimeo.

The link, in case everything else fails, is: http://vimeo.com/43038579

A Better Definition of Big Data

As I have noted, Big Data is called that because of the problems it causes – it’s too big for this or that kind of processing or usefulness (ever tried looking through a million rows of data for something interesting?). The conventional definition of big data is slightly tautological: data that is big (or “too big”) in volume, velocity or variety for the current generation of hands-on data tools.

The kinds of data driving this big data wave are, as I hinted previously, high granularity data from systems and sensors recording behavior of (and inside) environments, people, markets and machines. This is in contrast to previous generations of data processing, designed to record and analyze the results of behavior, such as events and transactions, but not the behavior itself. In this era of big data, we have reached a new limit – something never usually considered a limit at all. That limit is Moore’s law.

Moore's law is the observation that computer power (for a given price) doubles approximately every two years. For a long time, Moore’s law ensured that regular general-purpose computers improved fast enough to keep the databases humming, with no need for (more expensive) special-purpose products.

Then something happened. IT departments in large corporations started experiencing problems that outstripped the general improvements in computer power being delivered by Moore’s law. New technologies became commercial successes because these IT departments started to spend money on solving such problems. In 1996 TimesTen spun out of HP with a new approach to solve one relational database performance problem and became very successful in its particular corner of the market. Just as TimesTen was becoming successful, Boston-based Netezza launched its own very successful product that was, effectively, a special purpose computer for another key part of the relational database market.

In retrospect, these commercial successes heralded the start of the big data era. They illustrated, with the power of the market (including a great IPO for Netezza, and later acquisitions of both companies), that data growth was outstripping Moore’s law and new approaches were in demand. Both these companies relied on new arrangements of hardware (with proprietary software) to achieve new levels of performance. However, the big data wave quickly swung round to clever new software designs and algorithms, and clever new ways to parcel out problems to lots of regular computers working in parallel. Alongside new hardware, these new software approaches are just as powerful at coping with the big data that is outstripping Moore’s law. They include column stores (e.g., Vertica), Google’s famous MapReduce algorithm (e.g., Hadoop) and now many more.
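
To make that last idea concrete, here is a minimal MapReduce-style word count in Python (purely my own illustrative sketch, not Hadoop's actual API). It maps each document to (word, 1) pairs, groups the pairs by word, reduces each group with a sum, and parcels the map work out to a pool of ordinary worker processes.

```python
# A toy MapReduce-style word count: illustrative only, not Hadoop's actual API.
from collections import defaultdict
from multiprocessing import Pool

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one document.
    return [(word.lower(), 1) for word in document.split()]

def reduce_phase(item):
    # Reduce: sum the counts that were grouped under one word.
    word, counts = item
    return word, sum(counts)

def word_count(documents):
    # Parcel the map work out to a pool of regular worker processes.
    with Pool() as pool:
        mapped = pool.map(map_phase, documents)

    # Shuffle: group all emitted pairs by word.
    grouped = defaultdict(list)
    for pairs in mapped:
        for word, count in pairs:
            grouped[word].append(count)

    # Reduce each group independently (also trivially parallelizable).
    return dict(reduce_phase(item) for item in grouped.items())

if __name__ == "__main__":
    docs = ["big data is big", "data outgrows compute"]
    print(word_count(docs))
    # {'big': 2, 'data': 2, 'is': 1, 'outgrows': 1, 'compute': 1}
```

Real systems like Hadoop run the same shape of computation across many ordinary machines rather than processes on one machine, which is exactly the "lots of regular computers working in parallel" idea.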

All this leads to my definition of big data:

data with velocity, volume and/or variety growing faster than Moore’s law
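
To see what "growing faster than Moore's law" looks like in practice, here is a back-of-the-envelope sketch in Python. The growth rates are illustrative assumptions (compute doubling every 24 months per Moore's law, data doubling every 12 months), not measurements.

```python
# Back-of-the-envelope: how quickly data outgrows compute when it doubles faster.
# The doubling periods below are illustrative assumptions, not measurements.

COMPUTE_DOUBLING_MONTHS = 24   # Moore's law: compute per dollar doubles ~every 2 years
DATA_DOUBLING_MONTHS = 12      # assumed: captured data doubles every year

def growth(months, doubling_period):
    """Multiplicative growth after `months`, given a doubling period in months."""
    return 2 ** (months / doubling_period)

for years in (1, 2, 5, 10):
    months = years * 12
    compute = growth(months, COMPUTE_DOUBLING_MONTHS)
    data = growth(months, DATA_DOUBLING_MONTHS)
    print(f"{years:>2} yr: compute x{compute:6.1f}, data x{data:8.1f}, "
          f"gap x{data / compute:6.1f}")
```

With those assumed rates, after a decade compute has improved about 32x while the data has grown about 1,000x, and that widening gap is what pushes buyers toward special-purpose hardware and smarter software.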

As a footnote, the recent announcements I covered in May, headlined by Intel’s massive grant to MIT related to big data research, bring this full circle. Gordon Moore made the observation that became Moore’s law just a few years before co-founding Intel.