I attended C-SHALS 2008, the conference on the semantic web in healthcare and life sciences, at the end of last week. It was a great event and covered a lot of ground.

The semantic web is a concept that encompasses the idea that computers should be able to read and have some "understanding" of the contents of web pages, so that software can do useful things with the information, rather than needing human interpretation.

By that definition, one might think that Google (and the other search engines) already provide some of the functionality of the semantic web. After all, Google can find web sites based on keyword input, and if you put the tilde sign (~) before a word in a Google query, it will also return pages that use related words. For example, search for "venture capital ~healthcare" and you will get a wider set of results than a search for "venture capital healthcare". Google uses some codification of related words based on syntax (looking for plurals or different forms of the same word) and a good thesaurus to find related words. However, this first level of understanding what is on a web page does not rise to the level required by those active in promulgating the semantic web, most especially the W3C (the World Wide Web Consortium at MIT).

The Semantic Web is about computer software being able to piece together meaning by combining data from different websites in useful ways. An example is providing data labelled in such a way that it can be combined with other data from elsewhere. This could be as simple as knowing that the "address" on one website is similar to the "location" on another website. By using labels from agreed vocabulary lists (often called taxonomies or ontologies), you can tell the computer exactly what your data is describing. For example, you might have an agreed vocabulary related to government identification documents, and one term is "Passport ID", which is agreed to mean the unique ID number on a passport issued by a sovereign government recognised by the UN. Hence various human-readable tables of passport numbers labelled "Passport num", "Passport No.", etc. could all carry a notation which refers to the agreed vocabulary list and the particular term, and thus the data could be combined without worry that there is some mismatch.
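To make the passport example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the vocabulary URI under example.org, the two sample tables, and the helper names values_for_term and combine are all hypothetical, and real semantic web systems would typically express the same idea with standards such as RDF rather than plain dictionaries. The point is only that once each table's local label is annotated with the same agreed term, software can combine the columns without guessing from the label text.

```python
# Hypothetical term from an agreed vocabulary of government ID documents.
# The URI is made up for this sketch; it is not a real published vocabulary.
PASSPORT_ID = "http://example.org/gov-id#PassportID"

# Two independent sources. Each keeps its own human-readable column label
# but annotates that label with the shared vocabulary term.
SOURCE_A = {
    "annotations": {"Passport num": PASSPORT_ID},
    "rows": [{"Passport num": "X1234567", "Name": "Ada"}],
}
SOURCE_B = {
    "annotations": {"Passport No.": PASSPORT_ID},
    "rows": [{"Passport No.": "Y7654321", "Holder": "Grace"}],
}


def values_for_term(source, term):
    """Return the values of every column in `source` annotated with `term`."""
    columns = [label for label, t in source["annotations"].items() if t == term]
    return [row[col] for row in source["rows"] for col in columns if col in row]


def combine(sources, term):
    """Collect the values for one vocabulary term across several sources."""
    combined = []
    for source in sources:
        combined.extend(values_for_term(source, term))
    return combined


print(combine([SOURCE_A, SOURCE_B], PASSPORT_ID))
# prints ['X1234567', 'Y7654321']
```

Note that the code never compares "Passport num" with "Passport No." as strings. The match happens entirely through the shared term, which is what removes the worry about a mismatch.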

C-SHALS looked at combining data from life science research and point-of-care settings, combining data from genetic studies across species, sharing models of knowledge from universities to government regulators, and more. We were reminded, more than once, that the semantic web is the latest attempt to solve the problem of making knowledge available to computer programs. That effort used to be called Artificial Intelligence (AI), which is also a separate, if related, field. The business community never saw a great deal of use for AI, although venture capitalists lost a bunch of money proving that. Subsequently there was a wave of Knowledge Management (KM) work, some of which bore fruit, but this was about sharing human knowledge, and in many ways the world wide web, with all its public and private web sites, has become the greatest success of the KM effort. Now the KM folks are all working on the semantic web.

C-SHALS was such a success in large part because of the two co-chairs of the W3C Semantic Web Health Care and Life Sciences Interest Group and the work they do every day: Tonya Hongsermeier and Eric Neumann. Dr. Tonya Hongsermeier is head of knowledge management at Partners Healthcare, a Boston-based hospital and healthcare system. Her work is at the forefront of applying semantic web technologies to get the right knowledge to the nurse or physician at the right moment in patient care. This is really "mission critical", and Tonya's work is recognised globally as providing leadership in this arena. Eric Neumann is a neuro-biologist turned techie who lived through the Knowledge Management wars at a large pharma company and has combined all this experience into becoming the pre-eminent evangelist and expert for semantic web technologies in the life science arena (especially pharma and bio-tech companies). The fact that Tonya and Eric are both based in Boston helps make this area the epicenter for advancing this field, despite the lamentable fact that the only two reasonably successful commercial efforts represented at C-SHALS were from California.

One last comment, for now, must include David Karger's contribution to the field. David is a good friend, an Israeli folk-dance maven, and a professor at MIT. He also presented at C-SHALS, and his work on the Simile project was noted from many angles as pivotal in showing that the benefits of semantic web concepts could be reaped early and often with simple yet powerful tools. More on this in future posts. Meanwhile, practice saying Semantic Web a few times - it is easier than tongue-twisters using the phrase C-SHALS, and you will find yourself an expert in something important over the next few years.
