April 6, 2012
I am trying to stay far away from the current TAG discussions of httpRange-14 (now just HR14). I did my time, years ago. I came up with the best solution to date: use “303 See Other”. It’s not pretty, but so far it is the best we’ve got.
I gather now the can of worms is open again. I’m not really hungry for worms, but someone mentioned that the reason it’s open again is that use of 303 is just too inefficient. And if that’s the only problem, I think I know the answer.
If a site is doing a lot of redirects, in a consistent pattern, it should publish its rewrite rules, so the clients can do them locally.
Here’s a strawman proposal:
As an example deployment, consider DBPedia. Everything which is the primary subject of a Wikipedia entry has a URL of the form http://dbpedia.org/resource/page_title. When the client does a GET on that URL, it receives a 303 See Other redirect to either http://dbpedia.org/data/page_title or http://dbpedia.org/page/page_title, depending on the requested content type.
So, with this proposal, DBPedia would publish, at http://dbpedia.org/.well-known/rewrite-rules this content:
RewriteRule /resource/(.*) /data/$1 303
This would allow clients to rewrite their /resource/ URLs, fetch the /data/ pages directly, and never go through the 303 redirect dance again.
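To make the client side concrete, here is a minimal Python sketch of how a client might apply a published rule locally. It is only an illustration of the strawman: the function names `parse_rule` and `rewrite` are made up here, and I am assuming the pattern is matched against the whole URL path and that `$1` refers to the first capture group, as in Apache-style rules.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def parse_rule(line):
    """Parse one 'RewriteRule PATTERN SUBSTITUTION STATUS' line (assumed syntax)."""
    keyword, pattern, substitution, status = line.split()
    if keyword != "RewriteRule":
        raise ValueError("not a RewriteRule line: " + line)
    # Turn Apache-style $1 backreferences into Python's \g<1> form.
    template = re.sub(r"\$(\d+)", lambda m: "\\g<" + m.group(1) + ">", substitution)
    return re.compile(pattern), template, int(status)


def rewrite(url, rules):
    """Return (rewritten_url, status), or (url, None) if no rule applies."""
    parts = urlsplit(url)
    for regex, template, status in rules:
        # Assumption: the pattern must match the entire path.
        match = regex.fullmatch(parts.path)
        if match:
            new_path = match.expand(template)
            return urlunsplit(parts._replace(path=new_path)), status
    return url, None


# The rule text a client would have fetched from /.well-known/rewrite-rules.
rules = [parse_rule("RewriteRule /resource/(.*) /data/$1 303")]
print(rewrite("http://dbpedia.org/resource/Berlin", rules))
```

A client would fetch and cache the rules once per site, then call something like `rewrite` before every GET, falling back to the ordinary 303 behavior when no rule matches.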
The content-negotiation issue could be handled by traditional means at the /page/* address. When the requested media type is not a data format, the response could use a Content-Location header, or a 307 Temporary Redirect. The redirect is much less painful here; this is a rare operation compared to the number of operations required when a Semantic Web client fetches all the data about a set of subjects.
My biggest worry about this proposal is that RewriteRules are error prone, and if these files get out of date, or the client implementation is buggy, the results would be very hard to debug. I think this could be largely addressed by Web servers generating this resource at runtime, serializing the appropriate parts of the internal data structures they use for rewriting.
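As a rough illustration of that idea, the sketch below keeps one rule table and uses it both to answer live requests and to produce the published document, so the two cannot drift apart. The table `REDIRECTS` and both function names are hypothetical; a real server would serialize whatever structures it already uses internally.

```python
import re

# Hypothetical internal rule table: (path pattern, substitution, status code).
REDIRECTS = [("/resource/(.*)", "/data/$1", 303)]


def redirect_for(path, rules=REDIRECTS):
    """What the server does on a live request: return (status, target) or None."""
    for pattern, substitution, status in rules:
        match = re.fullmatch(pattern, path)
        if match:
            # Fill in $1, $2, ... from the matched groups.
            target = re.sub(r"\$(\d+)",
                            lambda m: match.group(int(m.group(1))),
                            substitution)
            return status, target
    return None


def rewrite_rules_document(rules=REDIRECTS):
    """What the server would serve at /.well-known/rewrite-rules."""
    return "".join("RewriteRule {} {} {}\n".format(*rule) for rule in rules)


print(redirect_for("/resource/Berlin"))
print(rewrite_rules_document(), end="")
```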
This could be useful for the HTML Web, too. I don’t know how common redirects are in normal Web browsing or Web crawling. It’s possible the browser vendors and search engines would appreciate this. Or they might think it’s just Semantic Web wackiness.
So, that’s it. No more performance hit from 303 See Other. Now, can we close up this can of worms?
ETA: dbpedia example. Also clarified the implications for the HTML Web.
April 14, 2011
18 months ago, when Ivan Herman and I began to plan a new RDF Working Group, I posted my RDF 2 Wishlist. Some people complained that the Semantic Web was not ready for anything different; it was still getting used to RDF 1. I clarified that “RDF 2” would be backward compatible and not break existing systems, just like “HTML 5” isn’t breaking the existing Web. Still, some people preferred the term “RDF 1.1”.
The group just concluded its first face-to-face meeting, and I think it’s now clear we’re just doing maintenance. If we were to do version numbering, it might be called “RDF 1.0.1”. This might just be “RDF Second Edition”. Basically, the changes will be editorial clarifications and bug fixes.
The adventurer in me is disappointed. It’s a bit like opening your birthday present to find nice warm socks, instead of the jet pack you were hoping for.
The most dramatic change the group is likely to make: advise people to stop using xs:string in RDF. Pretty exciting. And, despite unanimous support from the 14 people who expressed an opinion in the meeting, there has now been some strong pushback from people not at the meeting. So I think that’s a pretty good measure of the size of change we can make.
As far as new stuff…. we’ll probably come up with some terminology for talking about graphs, and maybe even a syntax which allows people to express information about graphs and subgraphs. But one could easily view that as just properly providing the functionality that RDF reification was supposed to provide. So, again, it’s just a (rather complicated) bug fix. And yes, making Turtle a REC, but it’s already a de facto standard, so (again) not a big deal.
The group also decided, with a bit of disappointment for some, not to actively push for a JSON serialization that appeals to non-RDF-folks. This was something I was interested in (cf JRON) but I agree there’s too much design work to do in a Working Group like this. The door was left open for the group to take it up again, if the right proposal appears.
So, it’s all good. I’m comfortable with all the decisions the group made in the past two days, and I’m really happy to be working with such a great bunch of people. I also had a nice time visiting Amsterdam and taking long walks along the canals. But, one of these days, I want my jet pack.
February 2, 2011
SemanticWeb.com invited people to make video elevator pitches for the Semantic Web, focused on the question “What is the Semantic Web?”. I decided to give it a go.
I’d love to hear comments from folks who share my motivation, trying to solve this ‘every app is a walled garden’ problem.
In case you’re curious, here’s the script I’d written down, which turned out to be wayyyy too long for the elevators in my building, and also too long for me to remember.
Eric Franzon of SemanticWeb.Com invited people to send in an elevator pitch for the Semantic Web. Here’s mine, aimed at a non-technical audience. I’m Sandro Hawke, and I work for W3C at MIT, but this is entirely my own view.
The problem I’m trying to solve comes from the fact that if you want to do something online with other people, your software has to be compatible with theirs. In practice this usually means you all have to use the same software, and that’s a problem. If you want to share photos with a group, and you use facebook, they all have to use facebook. If you use flickr, they all have to use flickr.
It’s like this for nearly every kind of software out there.
The exceptions show what’s possible if we solve this problem. In a few cases, through years of hard work, people have been able to create standards which allow compatible software to be built. We see this with email and we see this with the web. Because of this, email and the Web are everywhere. They permeate our lives and now it’s hard to imagine modern life without them.
In other areas, though, we’re stuck, because we don’t have these standards, and we’re not likely to get them any time soon. So if you want to create, explore, play a game, or generally collaborate with a group of people on line, every person in the group has to use the same software you do. That’s a pain, and it seriously limits how much we can use these systems.
I see the answer in the Semantic Web. I believe the Semantic Web will provide the infrastructure to solve this problem. It’s not ready yet, but when it is, programs will be able to use the Semantic Web to automatically merge data with other programs, making them all — automatically — compatible.
If I were up to doing another take, I’d change the line about the Semantic Web not being much yet. And maybe add a little more detail about how I see it working. I suppose I’d go for this script:
Okay, elevator pitch for the Semantic Web.
What is the Semantic Web?
Well, right now, it’s a set of technologies that are seeing some adoption and can be useful in their own right, but what I want it to become is the way everyone shares their data, the way all software works together.
This is important because every program we use locks us into its own little silo, its own walled garden.
For example, imagine I want to share photos with you. If I use facebook, you have to use facebook. If I use flickr, you have to use flickr. And if I want to share with a group, they all have to use the same system.
That’s a problem, and I think it’s one the Semantic Web can solve with a mixture of standards, downloadable data mappings, and existing Web technologies.
I’m Sandro Hawke, and I work for W3C at MIT. This has been entirely my own opinion.
(If only I could change the video as easily as that text. Alas, that’s part of the magic of movies.)
So, back to the subject at hand. Who is with me on this?
January 28, 2011
I’m disappointed in the pace of development of the Semantic Web, and I’m optimistic that the Lean Startup ideas can help us move things along faster.
I’ve been a fan of Eric Ries and the Lean Startup ideas for a while, but last night I was lucky enough to get to see him speak, and to chat with some other adherents. There are a lot of ideas here, but the bit that jumps out at me today is this, loosely paraphrased:
Reality distortion fields are bad. Instead of using charisma, style, and emotions to motivate your colleagues to act on faith, motivate them with experimental evidence.
I think we have scant evidence that the Semantic Web will work, and that most of us have been working on this as an act of faith. We believe, without solid evidence, that it can work and will be a good thing when it does. You could say we’re operating in an RDF (resource description framework) RDF (reality distortion field).
The Lean Startup methodology says that we should get out of that field as quickly as possible, doing the fastest experiments possible that will teach us what really works and does not work. On faith we can do 5+ year projects, hoping to show something interesting. Instead, we should be doing <3 month projects to test a hypothesis about how this is all going to be useful.
It’s a shame that most of us are funded in ways that don’t support or reward this at all. It’s a shame the research funding agencies operate on such a glacial and massive scale; in many ways they seem geared more towards keeping people busy and employed than actually innovating and producing knowledge for the world.
Below are my notes taken during Eric’s talk. I have not cleaned them up at all, so you can see just how badly my fingers spell “entrepreneur” when my brain has moved on to something else. I believe slides and the talk itself are available on line; it’s a talk he often gives, so if you have the time, watch it instead of just skimming my notes. (eg this one at Stanford.) Someone else with much better formatting and spelling posted their notes from last night’s talk. You probably want to read them instead, and then come back here and share your insights with us.
$$ Thu Jan 27 18:20:41 EST 2011 ((( EricReis 2 yrs ago at Hobies in palo alto, 6 people, first talking about this... silicon valley is parochial, rarely getting out of the bublle. #leanstartup new conversation -- what is entrepreneurship. strsaight from unhear of to overhypes, without people having learned about it. put entreneurship on a more solid footing. What is a startup? A startup is a human institution deseigned to delivera a new product or service under conditions of extreme uncertainty. Nothing to do with size of company, sector of the economigy, or industry. ALL THE BORING STUFF, and how to get better at it. Startup = Experiment Web 2.0 chart --- lots failed at 3 years. they all failed for BAD reasons. and how many really lived up to their potential....???!!! SO FEW. "If you do everything I did, you can fail like I did." We need a giant industrial support group. "Hi, I'm eric, and most of my startups failed." It's all Taylor's fault. :-) father of scientific management. 1911. birth of management "In the past, the man was first. In the future, the system will be first." "Work should be done efficiently" "Work should be divided into tasks" "It's possible to organize craftsmen" Management by exception -- only have them report their exceptions. Now, decomposing work into tasks is 100 years old. Everything in this room was constructed under the supervision of managers trained by Taylor and his disciples. Shadow Beliefs: * We know what customers want (reality distortion field) * We can accurately predict the future just dont believe the hockey stick spreadsheet * Andvancing the plan is progress eg keep everyone busy, write code, do your functional job! -- if we're building something no one wants, is it progress! [[ NO -- real progress is LEARNING ]] The Lean Revolution (Lean Manufacturing) W E Demming, Taiichi Ohno it's not Tim Quality Money -- pick two we can get all three by being customer focused. 
Agile Development Alas, Agile development comes out big IT departments. works IF you know what the customer really needs. Steve Blank. Customer Development Agile (Product) Development imvu story im networks -- join them all. he wrote this, in 6 months, to ship to customers. 5 years before that. had to pivot to standalone network. GREAT code, but no one wanted it. claim to have learn something -- about to get fired. :-) learning is a 4 letter word in management -- bad plan -- fired -- failure to execute -- fired Ask yourself: IF my goal was to learn this, could I have done this without writing the code? YES --- just make the landing page!!! As an entrepreneur, you NOT LONGER HAVE a FUNCTIONAL DISCINPINE. you do whatever you need to to get there Entreprenursip is management + OUR GOAL is to create an institution, not just a product * traditona lamangement preactices fail. (mba) * nee entrepeurial managemt -- working under extreme uncertainyu The Pivot --- SUDDENLY overhyped. YOU MUST be able to do this. The successes can do this. They can find the good ideas from the bad, inside the distortion field. SPEED WINS. how many pivots you have left. if we can reduce the time betwwen pivts, we can increas the odds of our success. BUILD -> MEASURE -> LEARN startup= turns ideas into code IF YOU DIDNT BUILD ANYTHING, you cant pivot IF YOU DIDNT TEST IT WITH CUSTOMERS, you cant pivot MINIMIZE total time through the loop. Cycle time dominates. gnl mgmt is about efficiency, not cycle time. GOING THROUGH THE LOOP -- thats how you you settle arguments between founders. How much design -- a reasonable balance. FIVE principals. -- entreps are everywhere anywhere we seek out uncertainty which is everywhere, given uncertainty from IT rev. -- entrp is mgmt -- validated learning -- innovation accounting normally just compliance reporting but: drive accountablity -- hold mgrs accountable GM = "std volume" to compute how many cars each division is expected to sell. 
allows gm to give bonuses. NOT good for entreps. "success theater" (cumulative total registrations Heh.) ACTIONABLE METRICS, per customer, NOT VANITY METRICS. facebook per-customer behaviors were exciting. Customers were heavily engaged in voluntary exchange with company. And very viral. NEED an accounting paradigm for entrps to prove they've done validated learning. so you never take credit for random stuff, but only take credit for what you derve it for. -- build measure learn HOW do we know when to pivot? as if it were obvious when there's a failure. land of the living dead. persevere straight into the ground. Right answer: (acocunting) pattern, like in science, when the experiments are no longer very productive. If When we can't move the needle very much. Vision, Strategy, or Product - what makes a great company? 500 Auto companies before Ford!!! they didnt have the right process. Vision doesnt change. it's about changing the world. Strategy is how to build a business around that. product dev == optimization pivot is changing strategy, not vision. THERE is not testing The Vision. We're NOT trying to elimintate vision. What should we measure? How do products grow? Entrp. accounting Are we creating value? What's in the MVP? - should a feature be in or out? Out. Can we go faster? NEW BOOK. ================================================================ lean.st/LeanStartupBos startuplessonslearned $$ Thu Jan 27 19:53:19 EST 2011 How do you keep engineers having faith in the process given MVP. How to manage engineers under uncertainty? 1. Keep them calm. Heads down, cranking out code. [reeks of frd taylor.] 2. Enlist all functions in process of discovering if on right track. ABANDON Reality Distorion Field. People will be way more creative if they know what's going on. The truth will set you free. ================ Q: Newbie: use of the term "movement" Eric: I dont want other people to to be doing this. Eric: I used to be a coder. What do I do now??? 
there IS something going on worldwide. this is science, not religion. lets be careful. if it works for entrepreneur, it's part of Lean Startup. We're learning a lot over the past two years. The movement is not me -- the movement is you guys testing these ideas, in changing the world. this is NOT about proprietary advantage. Eric used to think the right way to change the world was get the VCs to evangelize. Sooooo dumb of me. vc: "im not that interested in improving the world, just my profolio" But now, we should do science. If we all do it, we'll all improve the world and live in a better world. Q: how to test ideas people are not searching for. eg dropbox -- no one knows they want it. If customers dont know they have the problem and know the name of it, you have to find a new way. at imvu, people didnt know it "outbound is the new inbound" we did ad compaigns, $5/day, buying keywords of every adjacent product, "*" + chat. And drove people to our landing page. We wanted to learning the differeences in convertion between these channels. dropbox's MVP was a video, aimed at DIGG users. Drove people to waiting list, beta users. justin tv sl conf video ================ MBAs. how much do MBAs need to re-learn? er: I'm doing entreprenur in residence at harvard business school. but why waste time with MBAs? "what do people say about us when we're not in the room?" MBAs have one big advantage: very process and discipline oriented. if you dont have some failures, yo dont learn. you need to be able to tell what change to the product/market caused the numbers to change. IT HAS TO BE A VALID SCIENTIFIC EXPERIMENT. ================ Some new stuff: the right things to measure are clear and consistent across all startups. 1. value test -- do you know it creates values 2. working enging of growth. two feedback loops: -- eg loop in cylendar engine, and driver-and-surroundings write down how to get to work == taylor plan three engines of growth -- paid. 
you make a $1 per customer, and they cost $0.50 to buy (have to be able to buy customers) -- viral. as a necessary consequence of customer using it, they get their friends to use it. "someone has tagged you in a photo" you HAVE to click on that. even some fashion busineses. they "grow themselves" bye xploiting bug in human naturo -- sticky, engagment. addictive, network effects, lockin, ebay, wotw, compounding interest. so small viral can compound it, if sticky enough. ================ easily replicated product, get to market first? fear: someone will steal my idea. So: take your second best idea, and try to get someone to steal it. TRY to get them to steal your idea. PEOPLE dont steal ideas. IT SEEMS NUTS, BUT ITS TRUE. You need a good idea. Threat by big company to clone you -- they poached a co-founder -- came out with exacty product two years delayed. $100m failure for them. FIRST mover advantage is very rare in reality. (!!!) ================ one person from each company. how to get whole company to buy in? people often say: that's a really great idea for someone else to do. the issue is the WORK is a system. Your company is a perfect robust system, stable. very very hard to change -- must be planned carefully. try to find one area where there is painful uncertainty, and say there is a community of people trying science to solve this problem. every nods at maximize speed through the loop. BUT if you do that, you will MAKE PEOPLE FEEL INEFFICNENT. People will be interrrupted to do things "not their job". There will be a team powwow where people say they hate it making them less efficient. NEED people bought into theory -- understanfing the value of VALIDATED LEARNING. Only do this where you have authority, maybe just yourself. ================ How does Lean Startup affect managment & sales force. Lab Equipment company. --- "sales people are whiners" Really, customers were giving great feedback, and non of that was making it back to management. 
4 steps to ephinary -- steve blanks book -- perfect on enterprise sales. YOU CANNOT DELEGATE customer development. founders and senior mgmt have to be in the room with the customers, at least some of the time. Salespeople arent supposed to be good listeners. Mgrs should DO THE SALES THEMSELVES. THe goal is not to make money, it's to get validated learning.... ONCE we understand how to do the sales, THEN give it to the salesfolk, as per Steve Blank. if using sales force, you are doing Paid Engine Of Growth. ================ Q: I have a product that people havnts paid for. mainstream product. personal keepsake. needs to look really good. hard to measure. style counts. what's mvp in that scenario. why cant you do a landing page. keepsake book. goal of MVP --- least amount of work to learn what needs to be learned. such as whether customers will pay. eg get pre-orders. you can always test demand through fake landing pages. "Concierge" from food-on-the-table. did it all by hand, until they figured out what folks REALLY wanted. PEOPLE will not truthfully answer what they would do. "Would you buy this" turns out to be TERRIBLE DATA. **** YOU ALWAYS CAN TEST requires VERY difficult risking rejections. CUSTOMER SERVICE HURTS!!! * Eric *HATEs* customer feedback * does he really want to know what you thought about today's talk? No!!! ================ When collective feedback, its NOT ABOUT YOU, it's about the person giving it to you. "product is okay" means "product sucks but I'm polite" ================ very very hard. but rewards are emense. think about all the people utterly wasting their time. let's redeem them; make it happen!. $$ Thu Jan 27 20:35:27 EST 2011 )))
November 10, 2010
I propose that we designate a certain subset of the RDF model as “Simplified RDF” and standardize a method of encoding full RDF in Simplified RDF. The subset I have in mind is exactly the subset used by Facebook’s Open Graph Protocol (OGP), and my proposed encoding technique is relatively straightforward.
I’ve been mulling over this approach for a few months, and I’m fairly confident it will work, but I don’t claim to have all the details perfect yet. Comments and discussion are quite welcome, on this posting or on the firstname.lastname@example.org mailing list. This discussion, I’m afraid, is going to be heavily steeped in RDF tech; simplified RDF will be useful for people who don’t know all the details of RDF, but this discussion probably won’t be.
My motivation comes from several directions, including OGP. With OGP, Facebook has motivated a huge number of Web sites to add RDFa markup to their pages. But the RDF they’ve added is quite constrained, and is not practically interoperable with the rest of the Semantic Web, because it uses simplified RDF. One could argue that Facebook made a mistake here, that they should be requiring full “normal” RDF, but my feeling is their engineering decisions were correct, that this extreme degree of simplification is necessary to get any reasonable uptake.
I also think simplified RDF will play well with JSON developers. JRON is pretty simple, but simplified RDF would allow it to be simpler still. Or, rather, it would mean folks using JRON could limit themselves to an even smaller number of “easy steps” (about three, depending on how open design issues are resolved).
Cutting Out All The Confusing Stuff
Simplified RDF makes the following radical restrictions to the RDF model and to deployment practice:
The subject URIs are always web page addresses. The content-negotiation hack for “hash” URIs and the 303-see-other hack for “slash” URIs are both avoided.
(Open issue: are html fragment URIs okay? Not in OGP, but I think it will be okay and useful.)
The values of the properties (the “object” components of the RDF triples) are always strings. No datatype information is provided in the data, and object references are done by just putting the object URI into the string, instead of making it a normal URI-label node.
(Open issue: what about language tags? I think RDFa will provide this for free in OGP, if the html has a language tag.)
(Open issue: what about multi-valued (repeated) properties? Are they just repeated, or are the multiple values packed into the string, perhaps? OGP has multiple administrators listed as “USER_ID1,USER_ID2”. JSON lists are another factor here.)
At first inspection this reduction appears to remove so much from RDF as to make it essentially useless. Our beloved RDF has been blown into a hundred pieces and scattered to the wind. It turns out, however, it still has enough magic to reassemble itself (with a little help from its friends, http and rdfs).
This image may give a feeling for the relationship of full RDF and simplified RDF:
Reassembling Full RDF
The basic idea is that given some metadata (mostly: the schema), we can construct a new set of triples in full RDF which convey what the simplified RDF intended. The new set will be distinguished by using different predicates, and the predicates are related by schema information available by dereferencing the predicate URI. The specific relations used, and other schema information, allows us to unambiguously perform the conversion.
For example, og:title is intended to convey the same basic notion as rdfs:label. They are not the same property, though, because og:title is applied to a page about the thing which is being labeled, rather than the thing itself. So rather than saying they are related by owl:equivalentProperty, we say:
og:title srdf:twin rdfs:label.
This ties them together, saying they are “parallel” or “convertible”, and allowing us to use other information in the schema(s) for og:title and rdfs:label to enable conversion.
The conversion goes something like this:
The subject URLs should usually be taken as pages whose foaf:primaryTopic is the real subject. (Expressing the XFN microformat in RDF provides a gentle introduction to this kind of idea.) That real subject can be identified with a blank node or with a constructed URI using a “thing described by” service such as t-d-b.org. A little more work is needed on how to make such services efficient, but I think the concept is proven. I’d expect facebook to want to run such a service.
In some cases, the subject URL really does identify the intended subject, such as when the triple is giving the license information for the web page itself. These cases can be distinguished in the schema by indicating the simplified RDF property is an IndirectProperty or MetadataProperty.
The object (value) can be reconstructed by looking at the range of the full-RDF twin. For example, given that something has an og:latitude of “37.416343”, og:latitude and example:latitude are twins, and example:latitude has a range of xs:decimal, we can conclude the thing has an example:latitude of “37.416343”^^xs:decimal.
Similarly, the Simplified RDF technique of putting URIs in strings for the object can be undone by knowing the twin is an ObjectProperty, or has some non-Literal range.
I believe language tagging could also be wrapped into the predicate (like comment_fr, comment_en, comment_jp, etc) if that kind of thing turns out to be necessary, using an OWL 2 range restriction on the rdf:langRange facet.
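The conversion steps above can be sketched in a few lines of Python. This is only a rough illustration, not a worked-out design: the `SCHEMA` table stands in for what a consumer would learn by dereferencing the predicate URIs, the `example:relatedPage` and `example:pageLicense` properties (and their twins) are invented for the example, terms are plain strings in a Turtle-like notation, and a single blank node label is used where a fuller version would mint one per page (or use a “thing described by” URI).

```python
# Hypothetical schema information, as if gathered by dereferencing each predicate.
SCHEMA = {
    "og:title": {"twin": "rdfs:label"},
    "og:latitude": {"twin": "example:latitude", "range": "xs:decimal"},
    # Invented properties, only to exercise the other two cases:
    "example:relatedPage": {"twin": "example:related", "object_property": True},
    "example:pageLicense": {"twin": "example:license",
                            "object_property": True, "metadata": True},
}


def to_full_rdf(page_url, predicate, value, schema=SCHEMA):
    """Convert one simplified-RDF triple into a list of full-RDF triples."""
    info = schema[predicate]
    page = "<{}>".format(page_url)
    triples = []
    if info.get("metadata"):
        # The page itself really is the subject (e.g. its license).
        subject = page
    else:
        # Otherwise the page is about the real subject.
        subject = "_:topic"
        triples.append((page, "foaf:primaryTopic", subject))
    if info.get("object_property"):
        obj = "<{}>".format(value)                     # URI hidden in a string
    elif info.get("range"):
        obj = '"{}"^^{}'.format(value, info["range"])  # typed literal
    else:
        obj = '"{}"'.format(value)                     # plain string (no escaping here)
    triples.append((subject, info["twin"], obj))
    return triples


for triple in to_full_rdf("http://example.org/place", "og:latitude", "37.416343"):
    print(triple)
```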
So, that’s a rough sketch, and I need to wrap this up. If you’re at ISWC, I’ll be giving a 2 minute lightning talk about this at lunch later today. But if you’ve read this far, the talk won’t say anything you don’t already know.
FWIW, I believe this is implementable in RIF Core, which would mean data consumers which do RIF Core processing could get this functionality automatically. But since we don’t have any data consumer libraries which do that yet, it’s probably easiest to implement this with normal code for now.
I think this is a fairly urgent topic because of the adoption curve (and energy) on OGP, and because it might possibly inform the design of a standard JSON serialization for RDF, which I’m expecting W3C to work on very soon.
October 17, 2010
Last week, I saw The Social Network. I enjoyed it as a movie (like everyone else, it seems), but it also made me unhappy, because it reminded me what a misdirected force Facebook is in danger of becoming (or already is). As most people realize, Facebook centralizes too much power; unless it changes course, this will be its undoing. I hear cheers from some in the audience, but it’s the users who will suffer along the way.
I’ll start with a quote from writer Jessi Hempel, reviewing the movie from the perspective of someone who claims to know Mark Zuckerberg personally:
The real-life Zuckerberg was maniacally focused on building a web site that could potentially connect everyone on the planet. As early as 2005, he told me, “It’s a social utility and what makes it work will be ubiquity.” [Fortune]
To a first approximation, that’s the same as my goal of many years: building a system to connect everyone on the planet. But I don’t think it can possibly work if it’s a centralized system, with one organization controlling it to any substantial degree. Facebook may have 500m users, but it’s not going to get to 5b users until it’s a truly decentralized, open platform like the Internet and the Web.
More importantly, it won’t get to the point where we can, in good conscience, require or assume our fellow travellers on this planet use it, as we generally can with email, the Web, and the telephone network. Some communities (eg schools) are requiring people use Facebook, and I’m not the only one who finds that scary and offensive.
Of course, Mark Zuckerberg is a smart guy. Wired reports him saying:
I don’t think the world is going to evolve in a way that there is just one big site. I think it is going to be that there are going to be a lot of really great services and we are helping to get it there. I think people are always a little skeptical when something grows to something big, but I think you need to look at what it is doing.
And he’s not the only one. When Google made their first attempt to replace email with Wave, they knew it would have to be decentralized, with them just being one of many equal hubs.
When I was younger, I loved decentralization because it got us out from under the control of authorities I didn’t respect. I think that particular fire may have gone out for me, but I still see the need: if we’re going to build the kind of universally shared apps the planet needs (and Facebook dreams of), they have to be built on an open, decentralized platform. Otherwise there is no way they’ll be able to reach even as far as the Web does now.
In a perfect world, I would now sketch out how to build a decentralized version of Facebook. But I seem to have too much else to do right now. So, at very least, that will have to wait for another day.
I can say that it would be built using linked data. I came to linked data as a good way to build global scale shared/social apps, and I still think it’s the best approach. There are some more details to work out, though. Sadly, I haven’t come across any promising funding or business models to support that work. Decentralized businesses don’t have market lock-in and $100m+ exits.
It may be Diaspora will do it. I’m confident before they get very far they’ll have to re-invent or adopt RDF, and eventually the rest of the SemWeb stack. I haven’t yet looked at their design. It may also be Facebook itself will do it. (The fact that Zuckerberg still controls the company, instead of investors, makes it somewhat more likely.)
I suppose, after saying all this, it’s on me to show how SemWeb technology actually helps. Or is that obvious?
Edited to Add: I got a private question about my claim that facebook can’t scale to 5b users, so let me expand on that a little. I see two things stopping them:
- A branding, and look-and-feel problem. Some people hate facebook, without even knowing why. Some people find the site awkward and difficult. This is going to be true of any site; I think the only way around this is to provide for multiple brands with multiple user interfaces. In theory, facebook could do this themselves, much like car manufacturers have multiple “makes”: Cadillac and Chevy are just product lines from the same company, but people’s feelings are directed mostly at the product line.
- A trust issue. Some communities (including some governments) will, quite rightly, refuse to trust facebook to operate in the way they want. It’s possible facebook can find a way to address this concern as well, with special contracts, and even special data centers. For instance, it wouldn’t be impossible for them to build a facebook cluster for CIA internal use, in a CIA facility, subject to full CIA controls, but still somewhat interoperable with facebook at large. But I wouldn’t hold my breath waiting for that to happen, either. I’m not sure how it will play out when teachers ask their students, and their parents, to use facebook.
So, that’s not an ironclad argument that they can’t grow to 5b, but that’s what I’m thinking.
June 8, 2010
I’m going to try explaining linked data again, tonight, at the Cambridge Semantic Web Gathering. I will attempt to keep it simple, while still covering the important details. We’ll see how it goes.
My slides, for people who want a peek ahead of time, are here.