What Is The Web?

October 27, 2008

This question doesn’t usually matter very much, but sometimes we get into arguments about whether something (e.g., e-mail, streaming video, instant messaging) is part of “The Web” or not.

Due to various circumstances last week I found myself drafting this text about it (while kicking myself and telling myself it was a waste of time). Jet-lag insomnia, more or less.

I think the main value of my text, or any effort like this, is in revealing our often-unstated ideas about what the Web can or should become. Is this “architecture”? When you renovate a building, adding new wings, how much are you changing its architecture? How much are you changing its subtle good and bad features? Certainly it’s good to think about that question before making the changes.

It is also nice to get our terms straight. I loathe the term “Resource” and quite dislike the term “URI”. Along the way here, I try to define what I think are the key terms in Web Architecture. For extra credit, I end up defining a “Web Standard”.

Wikipedia says the Web is:

…a system of interlinked hypertext documents accessed via the Internet.

In contrast, WebArch says it is:

…an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URI)

Without going into my dislike for WebArch, I really wish the Semantic Web would/could (somehow) stick to a definition closer to Wikipedia’s.

My one-line definition is this:

The Web is a global communications medium provided by a decentralized computer system.

In more detail:

“The Web is a global…”: Although conceptually there could be many different “webs”, there is one which is understood as “The Web”. The Web follows (and uses) The Internet in being designed to connect different local systems. An installation of web technology usually ends up connected to others, becoming part of the unified global Web, because in most situations the value of doing so greatly outweighs the costs. The end effect is a single, integrated system built up of all available, connected components.

“communications medium”: The Web provides a way for people to communicate with each other. It does this by letting them create web pages (often collected into web sites), with unique names (the web address, or URL), which other people can view and interact with. The system does not restrict what exactly constitutes a “page” (sometimes called a “resource”): originally, Web pages were essentially an on-line version of paper documents, but they have evolved to now provide, within each “page”, a full user interface to remote computer systems. The web addresses are essential, because they allow people to communicate about particular pages and, crucially, they allow one page to name another (to link to another) so that users can learn about and “visit” (use) other pages.

Although generally intended for use by people, the Web is sometimes used by other computer systems. A search engine traverses the Web like a user, and then helps users find the pages they want. The Web Services and Semantic Web standards provide various ways for computer systems to interact with each other over the Web, attempting to leverage Web infrastructure as an element in new systems.
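To make the “traverses the Web like a user” idea concrete, here is a minimal sketch of link-following over a tiny in-memory stand-in for the Web. Everything in it is an assumption made for illustration: the `TOY_WEB` table, the `example` addresses, and the `crawl` function are hypothetical, and a real search engine would fetch pages over HTTP and do far more than this.

```python
from collections import deque

# Hypothetical toy "web": each address maps to the addresses its page links to.
TOY_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/"],
    "http://c.example/": ["http://a.example/"],
    "http://d.example/": [],  # no page links here, so link-following never finds it
}


def crawl(start, web):
    """Follow links breadth-first from `start`, visiting each page once."""
    seen = [start]            # pages discovered so far, in discovery order
    queue = deque([start])    # pages whose links still need to be followed
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


visited = crawl("http://a.example/", TOY_WEB)
print(visited)
```

The point of the sketch is only that discovery depends on addresses and links: the page at `http://d.example/` exists in the table, but since no other page names it, the traversal never reaches it.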

“decentralized computer system”: While the Web is in one sense a single system, it is composed of other computer systems, most of which serve as web servers or web clients. It has no central point of control (except perhaps the Domain Name System (DNS), which is part of the underlying Internet); instead, the system’s behavior for a particular user depends on the clients and servers being used by that user. Many features of the Web rely on the behavior being essentially the same for all users, and that consistency depends on the underlying systems behaving consistently. Where there is consistent behavior, and that behavior is documented, the document is a Web Standard.

My Web 3.0 Prediction

October 18, 2008

In the wrap-up session for the Web 3.0 event, I tried to crystallize my prediction for Web 3.0. In doing so, I was trying to step away from the Semantic Web as a goal, and instead just think about what I think is inevitable.

So my prediction is like this:

Some people say Web 2.0 is about Ajax. Other people say it’s about websites connecting users to users, forming on-line communities around read-write websites. I think these two notions are related in that without Ajax, websites were so clunky that full participation by the mass of users was impractical. With Ajax, developers could make sites that were both powerful and comfortable.

By the same token, Web 3.0 will be about Semantic Web technologies enabling a set of noticeably more powerful and convenient applications. Most crucially, it will be about everyone who maintains some data making it available in a standard form, so applications can be written to use data from many sources. These applications will feel different; they will appear to “know” a lot.

For the techies, Web 3.0 will be about RDF, like Web 2.0 was about Ajax. But for users, it will be about software systems which have access to all the data they can effectively use, instead of being dumb little things, trapped each in its own little box.
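Here is a rough sketch of what “data in a standard form” buys an application. It models RDF-style data as plain (subject, predicate, object) tuples rather than using any real RDF library, and the two sources, the `example.org` addresses, and the helper names `merge` and `objects` are all made up for illustration. Real RDF would also use full URIs for the predicates; the short names here are a simplification.

```python
# Two hypothetical sources, each publishing facts as
# (subject, predicate, object) triples.
SOURCE_A = {
    ("http://example.org/alice", "name", "Alice"),
    ("http://example.org/alice", "knows", "http://example.org/bob"),
}
SOURCE_B = {
    ("http://example.org/bob", "name", "Bob"),
    ("http://example.org/bob", "worksAt", "http://example.org/acme"),
}


def merge(*sources):
    """Combine triples from any number of sources into one set."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def objects(graph, subject, predicate):
    """Return, sorted, every object recorded for a subject/predicate pair."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


graph = merge(SOURCE_A, SOURCE_B)

# A question neither source can answer alone:
# what are the names of the people Alice knows?
friend_names = [
    name
    for friend in objects(graph, "http://example.org/alice", "knows")
    for name in objects(graph, friend, "name")
]
print(friend_names)
```

Because both sources use the same shape and the same address for Bob, merging is just a set union, and the application appears to “know” something that neither source stated on its own.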

In the event’s analogy format, turned sideways, I guess we could say RDF is to Web 3.0 as Ajax is to Web 2.0. Ajax was the enabling, trigger technology for Web 2.0. RDF (or something like it) will be the enabling technology for Web 3.0, enabling a whole set of applications that are prohibitively difficult without it.


From Footpaths to Freeways

October 17, 2008

Here at the Web 3.0 Event, they’re giving out t-shirts with two blanks on them, saying “Web 3.0 is to Web 2.0 what ____ is to _____”. Somewhere, there are pens, and you’re asked to fill in the blanks.

My first thought (surprise, surprise) was that Web 3.0 is about decentralization. I couldn’t think of the right words to capture that, but Dave Beckett sat down next to me and, after I explained what I was looking for, suggested something I liked:

Web 3.0 is to Web 2.0 what the Web is to Walled Gardens.

This morning I thought of another angle, which I also like:

Web 3.0 is to Web 2.0 what freeways are to footpaths.

Get it? On Web 1.0 and Web 2.0, humans “walk” everywhere on the web, strolling from website to website, going about their business. Web 3.0 will allow you to zoom across sites (gathering the data you want) at inhuman (machine) speeds.

That’s not how I normally think of the web’s future, and it’s not a very pleasant view, but it may be accurate.

More broadly, what is Web 3.0?

My sense from this meeting is simple:

The good news is that Web 3.0 is the Semantic Web.

The bad news is that we still don’t know what the Semantic Web is.

That is, the long-standing issues in the Semantic Web community are just as rife within the nascent Web 3.0 community. Is it about machine learning? Is it about formal logic? Is it about sharing data? Is it about searching the natural language web?

My take, quite plainly, is that it may be about all of these, but the core is RDF-style data sharing. Natural language processing has a place in generating RDF. Formal logic has a place in helping us work with RDF data. But at heart, the core thing we need to do is share the data.

(This is my inaugural post on my new WordPress blog. I decided I wanted one using off-the-shelf tech, separate from W3C and MIT. Any bets how often I’ll post?)

[eta testing link to