Yaron’s European Vacation

September 30th, 2010

My wife and I got back a little over a week ago from a nearly three-week vacation through Europe, covering both the South and the North. We took a lot of photos, but sadly I don’t have any of them online, and I wanted to get this post out relatively quickly, so you’ll have to paint pictures in your head using my words (or, you know, do a Google Images search).

The first part of the trip was a week-long yacht trip in Italy, starting in Rome, which wasn’t quite as luxurious as the word “yacht” might imply but was still pretty cool. It was all pre-arranged, with five passengers on board (the two of us, and friends of ours), two crew and one captain; though some passengers got involved in manning the boat by the end (not me; I was mostly content with sleeping on the deck, punctuated by watching the waves). We ate well on that trip - a lot of really good pasta, and all manner of cheese, and some nice pizza. Being a vegetarian, I missed out on all the interesting meats, like sopressata and wild boar, but you can’t really get enough fresh pasta, I say.

Our boat stopped a few times for swimming breaks in the Mediterranean Sea, which was great for me because I never quite got my “sea legs”, so being out in the actual water was a bit of a relief every time. Our ship had a tendency to attract swarms of jellyfish, or maybe they’re just everywhere in the ocean. I got stung by one, which didn’t hurt any worse than, say, a mosquito bite, but then somehow after that I got a reputation among the people on the boat for being fearless about swimming near them. What can I say; I didn’t see them as a threat - like me, they were in Italy in search of a good meal.

The Italy trip started and ended in Rome, and we got to see a lot of small towns and islands around there that I otherwise probably never would have seen. Mostly we were in places around Tuscany: Siena, Isola del Giglio, Massa Marittima, and some other even smaller towns whose names I can’t remember. We also saw the island of Elba, which I assumed would be desolate because that’s where Napoleon was banished to (for just a year, it turns out), but evidently there’s been some progress in the last 200 years, because it’s now a very touristy resort town, looking like Cannes or Acapulco or some such. I don’t know what it takes to get banished to Elba these days, but it’s worth looking into.

Then it was on to a few days in Madrid, where we stayed at the Room Mate Mario hotel, probably the “chillest” hotel I’ve ever stayed at: bossa nova soundtrack in the lobby, bright plastic decorations in the rooms. I’d stay there again any time.

We saw a flamenco show at the Cardamomo, which I guess has become popular among Americans since getting written about last year in the New York Times. As the NYT noted, it’s “unadorned flamenco” - no ruffles, no hats, just good music and tortured expressions. I thought it was great. We also drove around the city via the Go Cars, which are basically a motorized scooter that shouts out descriptions of the places you’re passing by, and directions to the next stop (it uses GPS to know where you are). Pretty neat, although it took a little while to get used to being pointed at by the locals. And Spanish food, even for a vegetarian, is great - it’s a country with a deep respect for the fried potato and the olive, two foods I really like.

The Netherlands capped off the trip. It may be just me, but two weeks of traveling as a tourist is about the maximum for me before I start to feel a little restless, after having moved along from place to place in search of food and entertainment. But the Netherlands provided a clean break from that, since there was some actual work to be done. First there were three days in the charming college town of Wageningen (the Dutch ‘g’ is pronounced like a guttural ‘ch’, as I found out), where I helped out some people at KeyGene with their Semantic MediaWiki installation. I had known for a while that Semantic MediaWiki has found a nice niche for itself in the biological sciences, which experience a relentless flow of new data, new terminology, new interconnections, all of which semantic wikis are well-suited for; but it was great to finally see, in detail, how that was applied to a very specific usage: in this case, genomics research on different plants. It was really eye-opening and felt great to see; like visiting the British Royal Astronomical Society in the 1700s while they were discovering various stars and planets.

Then it was on to Amsterdam, where we stayed with an old friend of mine from New York who had helpfully moved to Amsterdam about four years ago. There I attended the Semantic MediaWiki Conference (SMWCon), which I plan to write a separate post about, on the WikiWorks blog. We also sampled some of the Amsterdam local culture, both with my friend and with some of the Semantic MediaWiki people, which was great - Amsterdam’s just a cool place to hang out in, undeniably.

And after almost three weeks, we ended up back in New York, tired yet refreshed, and me with about 80 items on my to-do list. Well, now I can check off one more.

News from the exciting world of SMW

February 12th, 2010

Some random Semantic MediaWiki-based news, that I haven’t gotten to because I’ve been away from my blog… future updates like this will probably show up on the WikiWorks blog instead. So what does that leave for this blog? Who knows.

I’m back

February 11th, 2010

Somehow I’ve let this blog languish for over two months, which is much longer than I meant to. Somehow other things kept taking priority…

What have I been up to since my last post? A bunch of random stuff: I flew to Shanghai and San Francisco, and my wife and I went to Vancouver, and upstate New York (we also had our honeymoon before my previous post, which I meant to write about but then never did - it was great. We also have lots of photos… argh.) We celebrated the new decade with drinks and Beatles Rock Band. I released new versions of most of my MediaWiki extensions, plus Semantic Bundle.

We had a meeting of our New York MediaWiki users meetup. I created the Referata FAQ and Semantic MediaWiki FAQ. I answered lots and lots of questions about Referata, Semantic MediaWiki, and my other software (sadly, not everything was covered in the FAQs). And, maybe most importantly, WikiWorks has started taking off - we’re already working on our first projects, we have more that are still under discussion, and we have a service agreement contract, just like a real company would… also, as of a few days ago we have a blog and a Twitter account. Please follow, or comment, or whatever it is the kids are doing these days.

SMW Camp thoughts

December 2nd, 2009

It’s been a hectic last month; we got back from our honeymoon last week, and I’m about to go travelling again, and I don’t know when or if I’ll have time to talk about all of it (I definitely hope to); but I did want to write about the amazing weekend we had at the SMW Camp in Karlsruhe, Germany. The conference was sponsored by Ontoprise, and took place in and around the Ontoprise office; mostly in a beautiful, loft-like glass-walled meeting room (here’s a representative photo, and here’s one of me talking). In the introduction, Daniel Hansch from Ontoprise, who served as the main host of the event, described Karlsruhe as “the capital of the Semantic Web”, which is only somewhat of an exaggeration.

About 40 people (that’s my guess) showed up, mostly from Germany but also from the Netherlands, Belgium and the U.S. The attendees came from what I consider the four branches of the Semantic MediaWiki world: academics, business people, hackers, and the smallest branch, government people (represented here by someone from the German Air Force). And the talks represented an equally broad spectrum, covering subjects from software tutorials to use cases of SMW to business methodologies to marketing. It felt like a real conference, with two solid days of talks and in-depth discussions.

Some lessons I think could be learned from the whole thing:

  • There’s not much need for the really introductory stuff. We thought that a lot of the crowd might be newcomers to the SMW “community”, and a lot of them were, but almost everyone knew about and had used SMW to some extent. In the future, it might make sense to cut out the introductory tutorials altogether.
  • On that note - there’s no shortage of things to discuss. We had two full days of talks, from 10 AM to 6 PM (actually 7 PM on Sunday), and even then some talks were rushed; and there was still subject matter we wanted to get to but couldn’t, like discussions about interface. Two full days is easily achievable.
  • Semantic MediaWiki isn’t ready for a true “unconference” yet. The name “SMW Camp” was chosen in part because it was supposed to reflect an “unconference” type of event organization: discussion topics decided at the event itself, with multiple tracks so that people can stick to the topics they’re interested in. But it turns out that, in the Semantic MediaWiki world, everyone’s pretty much interested in everything: people want to hear about academic research, ongoing development, performance issues, and corporate usage. We did have parallel tracks for one session, and even that led to some complaints from attendees that they couldn’t hear about topics that they were interested in. So it looks like for the foreseeable future we’ll stick with one track. There’s also the issue of a pre-defined schedule versus an open one decided on the same day. What seems to work for SMW is having participants put down the names of presentations they’d like to give, on the meeting’s wiki page, in the weeks beforehand; organizers can then construct a schedule from that just a day or two beforehand. It’s a semi-structured approach that somehow seems appropriate for an event that relates to semantic wikis. I still like the name “SMW Camp” for the meetings, by the way, because it seems more descriptive than any of the alternatives; and because it’s distinctive enough that it’s easy to tell which events are official components of the series, unlike with, say, the name “SMW Meeting”.
  • SMW Camp could probably become a bigger production. The “users meeting” we had in Boston last year was free; and I lobbied hard to make this one free as well - we compromised on 15 euros. Then I found out that many of the attendees were surprised that the price was so low, when conferences that they considered to be of comparable quality routinely cost anywhere from $100 to thousands of dollars (that’s the general rate; speakers sometimes attend for free, and students usually get a deep discount). It’s clear that many of the attendees, at least the non-student ones, would be willing to pay a higher amount. That by itself isn’t reason enough to raise the rate, of course, but extra money could help in a few ways: catering of events (though going out to restaurants was nice); free swag, like stickers, t-shirts and pens, that some people talked about having; and maybe even some extra money for the hosts, the organizers, people travelling from long distances, SMW developers… there’s no shortage of people one could give extra money to if one had it. :) Tied in with that is the idea of getting corporate or university sponsorship of the event, which, now that we have a proven track record, might be easier to do.

One interesting related thought is that I was trying right after the event to recall how this whole idea of SMW users meetings came to be, when I realized that the person indirectly responsible for them is Sergey Chernyshev; which is ironic because he has yet to attend one. But during the winter of 2008, he kept pestering me to start a New York MediaWiki meetup (he finally started one himself, a few months ago, which we now co-run, in theory). I kept demurring, saying I was only interested in something directly related to SMW, so he said to try organizing something like that instead. With that encouragement, in October 2008 I sent out an email asking if anyone would be interested in a New York users meeting; to my surprise, the interested responses all came from Boston, Seattle and Germany; some of them had previously talked about having an international meeting, but it hadn’t yet coalesced. The Boston meeting, which was hosted by eMonitor (now LeveragePoint) and which we jointly put together, happened quite quickly; literally a month later. So you can thank Sergey for helping to bring about a meeting he’s never attended, with people he doesn’t know. :)

Announcing WikiWorks

November 5th, 2009

I’m thrilled to announce WikiWorks - a MediaWiki-focused consulting company that I just launched. This is my first serious business venture, unless you count Referata. But it doesn’t feel like a huge leap into the unknown, because consulting is already what I do - I’ve done at least some paid MediaWiki work for dozens of sites and companies over the last few years. The difference now is the additional people - WikiWorks is a small team of programmers around the world, all with significant experience setting up (and, in some cases, developing) MediaWiki; the goal is to make myself expendable, as it were, so projects can run smoothly even if I, or any other one person, can’t work on them at the time. We’re automating the process. Most of us also have other jobs at the moment, but these kinds of projects can almost always be done on a part-time basis, during off-hours; and in-depth projects involving full-time work, should they come, will be handleable in one way or another. The focus is on Semantic MediaWiki-based solutions, though we’re also equipped to take on regular, non-semantic projects.

So - if you’re from a company that would like to set up a wiki the right way, send us an email. If your company has a need for an easily-configured but powerful data integration system, and you would prefer software that’s free to something that costs a million dollars, send us an email. If you have too many Excel spreadsheets flying around the office, send us an email. If you already run a MediaWiki-based wiki, but want to make it nicer-looking, more user-friendly, and more like a true database application, send us an email. We’re looking forward to making some wikis.

Software, West Coast-style

November 3rd, 2009

I had an action-packed trip to California about a week ago. First was the 2009 Google Summer of Code Mentor Summit, which turned out to essentially be an open-source development conference, sponsored in an extremely generous way by Google. It took place at the Google campus, AKA the “Googleplex”, which I saw a long time ago back when it was the SGI campus, but now looks rather different. What can I say - for all the talk of cutbacks, it looks like Googlers still have it pretty good. The cafeteria food was so good, it made me just want to stay in the cafeteria all day.

The conference itself was quite interesting. I especially liked the talks about the non-development aspects of open source software, like the discussions on marketing and inter-project communication (I wrote the notes for both of those sessions, which I don’t think is a coincidence because I was interested in those topics to begin with). It was eye-opening to see that every open-source project, even the established ones with foundations and business models and lots of users (all categories that potentially describe both Wikimedia and MediaWiki), struggles with the same issues of gaining “buzz” and coordinating decisions that regular software companies, for better or worse, have professionals handling.

I also got to spend time with my brother and his wonderful family. And yes, I did go to this party, which was awesome (it was essentially a party full of people at various software startups, which you would never, ever see in New York); and, separately, I went to this great vegetarian restaurant as well.

After the weekend, it was time to head to the new MediaWiki office in San Francisco, where I met for two days with the members of the Wikipedia Usability Initiative. We had some very interesting and fruitful discussions, all on the subject of the template forms project, which is what I’m involved with. Lots of discussions about naming, which is always trickier than it seems!

In what really is a coincidence, earlier today I released the TemplateInfo extension, which is the first draft of my section of the work for the template-forms project. Hopefully it’ll end up on a gigantic website before too long.

Update: Oops, I forgot to post a link to the photo of all GSoC Mentor Summit attendees. Can you spot me? Hint: I’m in the back row, right next to the tree, in a blue hoodie.

Gone till December

October 23rd, 2009

Things have been busy lately, of course, and it looks like they’ll stay busy for the next month and a half… in the interests of keeping people informed, and in lieu of continuous Twitter feeds, I figured I’d share my upcoming plans:

  • This weekend and part of next week, I’ll be in California for the Google Summer of Code Mentors’ Summit, and to visit the Wikimedia Foundation people again.
  • While there, I may or may not also be attending this party.
  • The weekend after that is… Halloween.
  • The weekend after that, my lovely wife and I will be jetting off to Karlsruhe, Germany for SMW Camp 2009.
  • After that’s over, we’ll be flying to a few cities in Southern Europe for a few weeks for our honeymoon. European vacay - champagne and cigarettes!
  • Then it’s time for Thanksgiving.
  • A week and a half after that, I’ll most likely be flying to Shanghai to talk about Semantic MediaWiki at the Asian Semantic Web Conference.

I’m looking forward to the honeymoon, but otherwise I don’t know how well this new role of international jet-setter fits me… hopefully 2010 will be calmer all around.

Wedding memories

October 20th, 2009

Once again, I’m extremely delinquent with my posting… I should have written something right away after the wedding, but somehow I’ve let over two weeks pass. Anyway, yes, I’m a married man now! So that’s Mr. Yaron Koren to you. People say you don’t feel much different after you get married, but that hasn’t been quite true for me: I feel a big sense of relief now, and I guess a sort of inner calm. I’ve had a tendency over the last few years, after having started working for myself, to put everything into to-do lists, as a way of trying to manage all the chaos; and this was one big item on life’s to-do list, with many sub-items recently as the wedding got closer, and now it’s checked off. The wedding went great; everyone seemed to have a really good time. Unfortunately, there’s no central place for photos yet; there are some scattered among different Facebook galleries, only some of which I remember how to access; plus some in various emails. Anyway, here are some somewhat-grainy highlights, before the photos from the professional photographer show up:

Exchanging some vows!

Doing our first dance!

Cutting the cake!

Going to the (reception hall) of love

October 2nd, 2009

I had delayed writing about this for a fairly long time, partly because I decided a while ago that this blog was going to be strictly about technical and semi-technical issues, and I stopped writing about personal stuff, pop culture things, etc. After a while I started feeling guilty that I hadn’t written about it yet, which just made it hard to write something about it, thinking about how I’d have to explain my delay in writing about it, which just made the situation worse, etc. etc.

Anyway… I’m extremely pleased to say that I’m getting married in two days. (!!) My lovely bride, who for now prefers to remain anonymous on the internet, is named Lee (that’s actually her nickname, but it’s what everyone calls her), and I’ve known her for four years, and she’s the light of my life.

For an only-adequate, somewhat-too-Photoshopped photo of the two of us, but the only reasonable one I could find on this short notice, see here.

Forms coming to Wikipedia?

September 24th, 2009

I’m doing some part-time work for the Wikimedia Foundation now, on the usability project; you can see the first fruits of my labor here - a proposal for template-based forms on Wikipedia (this, I should note carefully, would not be using Semantic Forms). And you can see the spirited, if mostly tangential, discussion about it on the Wikipedia developers mailing list here.

Wikimania 2009 notes

September 15th, 2009

This email summarizes all the technical/Semantic MediaWiki parts of Wikimania 2009, in Buenos Aires, Argentina. Other highlights:

- getting to see Buenos Aires (and historic Colonia, Uruguay, just a ferry ride away). Buenos Aires is a beautiful city, with a nice-looking bridge; it looks quite a bit like a European city, but with much more political graffiti.

- seeing a keynote speech by Richard Stallman, the open-source pioneer, in which he both alienated and entertained the audience with his petulant attitude. Among other complaints, he was upset that Wikipedia doesn’t refer to Linux as “GNU/Linux”. See here for more than you’d care to really know about the whole issue.

- seeing all the talks getting translated into Spanish or English by in-person headset translators, which was pretty amazing; it felt like being at the UN.

- on that note, listening to some talks in Spanish, and being pleased to see that I could understand them without the headset translation. Although it helped that I knew the subject matter intimately ahead of time; I still can’t follow the telenovelas on Univision to save my life.

- the post-Wikimania party. There are some good dancers among the greater MediaWiki development community! I’m not naming any names, though.

Announcing Semantic Internal Objects

August 20th, 2009

My latest extension: Semantic Internal Objects; this is either number 10 or 12, depending on how you count it; which is hard to believe. What is Semantic Internal Objects? In short, it lets you encode compound information, or what’s sometimes known as “n-ary relations”, within Semantic MediaWiki. If you want to record that, say, someone is president of a country, you can do that easily with SMW. But if you want to record that that person was president from a certain year to a certain other year, that hasn’t been possible in SMW until now, because it can’t be represented as a simple relationship (okay, actually, it has been possible, through multi-value properties, but I don’t consider those an ideal solution for various reasons). Semantic Internal Objects (SIO), in short, lets you do that, using a new parser function. I’m very excited about this extension; I think it’ll open up a lot of possibilities for various SMW-based websites, but we’ll see…
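
For anyone who finds “n-ary relations” too abstract, here is a rough Python sketch of the idea. To be clear, this is not SIO’s syntax or SMW’s storage model, just an illustration; the Presidency class, the presidents_in function and the sample data are all made up for the example. The point is that “X was president of Y from year A to year B” is one compound fact with several parts, so it needs its own little object rather than a single subject-property-value link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presidency:
    """One compound fact (hypothetical): who held office, where, and when."""
    person: str
    country: str
    start_year: int
    end_year: int


def presidents_in(facts, country, year):
    """Return the people recorded as president of `country` during `year`."""
    return [
        f.person
        for f in facts
        if f.country == country and f.start_year <= year <= f.end_year
    ]


# Placeholder data for illustration only, not real history.
facts = [
    Presidency("Person A", "Examplestan", 1990, 1994),
    Presidency("Person B", "Examplestan", 1994, 2002),
    Presidency("Person C", "Otherland", 1990, 2000),
]

print(presidents_in(facts, "Examplestan", 1992))
```

A plain binary relation could only say “Person A, president of, Examplestan”; the years have nowhere to go unless the whole fact is treated as an object in its own right, which is the gap an “internal object” is meant to fill.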

Health care opinion: the breakdown

August 13th, 2009

Discourse DB has a topic page for the current American health-care bill, AKA the “America’s Affordable Health Choices Act of 2009”. 82 columns and editorials on the subject are already entered, subdivided into “For”, “Against” and “Mixed” on the topic. If you’re curious about the current distribution of thinking among the pundit class, check it out. (And if you know of opinion items in notable sources not already included in the list, please add them!)

Read “Lecturing Birds on Flying”

August 5th, 2009

Everyone should go out and read Pablo Triana’s new book, “Lecturing Birds on Flying: Can Mathematical Theories Destroy the Financial Markets?”. And I’m not just saying that because I’m quoted in it.

Okay, I am just saying that because I’m quoted in it. But - I’m quoted in it! The quote comes early on, on page 9, and it’s a paragraph from a blog post I wrote two years ago, in a review of Nassim Taleb’s “The Black Swan”, which also contained some thoughts about my old website, Betocracy, plus prediction markets and the “wisdom of crowds” theory. Dr. Taleb linked the blog post soon after I wrote it, on a page on his website that now appears to have been removed. That’s almost certainly where Triana read the post from, since he’s a devoted follower of Taleb’s.

I bought the book and read it, and it’s interesting - it’s an ode, in the manner of Taleb’s “The Black Swan” and “Fooled By Randomness”, to common sense in finance and a deep skepticism of “experts” who claim to have mastered the markets. Triana has the advantage of writing post-financial-crash, when the idea that the large banks were playing a con game has become standard opinion, right or wrong. He argues for it forcefully, with a focus on financial math, stating that mathematical formulations like the Black-Scholes formula and the concept of “value at risk” (VaR) are flawed and have provided cover for brazen financial gambles. More interestingly, he argues that Black-Scholes, though it’s taught as a basic rule of finance, is never actually used in the banking world; instead, traders make intuitive purchasing decisions that they then justify as some “fudge factor” on top of the supposedly set-in-stone Black-Scholes, using concepts like the “volatility smile”.

Anyway, you should all read it.

Interestingly, I found out about the book when I was called several months ago by the guy who recorded the audio version, to find out how to pronounce my name; among the stranger phone conversations I’ve had. And I guess this means a bunch of people have now heard my name as well; you can buy the audio version here, by the way, all 16 hours of it.

Semantic MediaWiki updates

July 31st, 2009

  • I was at the “NYC wiki-conference 2009”, held on the NYU campus, over the weekend; my thoughts about the conference are here. The one thing I forgot to mention, on a technical note, was a five-minute demo by Tom Maaswinkel, showing a MediaWiki wiki being edited via the soon-to-be-released Google Wave - it wowed the audience, as Google Wave demos tend to always do.
  • Jeroen De Dauw released version 0.2 of Maps and Semantic Maps. These new versions have, among other improvements, support for Yahoo! geocoding, and just better-looking code, which is going to be important in the long run, as other developers get their hands on it and start tinkering with the code.
  • I added Maps and Semantic Maps to Referata - Semantic Google Maps will be gone shortly. That means mapping on Referata has a lot more options, and it’s already starting to bear fruit - check out the Google Earth option on Food Finds, for instance. Pretty nice!
  • Sergey Chernyshev and I released a new version of Semantic Bundle, which now includes Maps and Semantic Maps, replacing Google Geocoder and SGM. It’s really the beginning of the end for SGM, not counting the 30+ wikis it’s already on…
  • While working on the new Semantic Bundle version, I had the thought that SMW is starting to feel like a mature technology; in that it seems like the majority of the features that it will eventually have are already in place. The addition of the Semantic Maps extension had a lot to do with it, I think; this was one of the big chunks that I thought was still missing. There are still things left to be done, of course; I have a list of around 30, though they won’t necessarily be features that I implement. And I’m sure there will be various improvements behind the scenes, to speed up queries and the like. But I really feel like the Semantic MediaWiki system of the future won’t look all that different from what it looks like now, with the interplay of categories, templates, forms, properties, External Data calls, tables, maps, calendars, widgets, etc. (whew!) that you can already find in various SMW-based wikis. Though I could be wrong about this.

For Semantic MediaWiki, it’s a mappy day

July 23rd, 2009

I’ve been working with Jeroen De Dauw, a student in the Google Summer of Code, on creating a full-scale mapping interface for Semantic MediaWiki for a few months now; by which I mean that he’s done the actual work, and I’ve been around to answer questions and try to bask in the glory. Anyway, I think mapping is crucial for any generic data project, because so much information that we need on a daily basis is location-based, whether it’s information about businesses, people, events, etc. There’s already an extension that handles all this stuff - Semantic Google Maps - but it’s incomplete, first because it relies on Google Maps, which not everyone can use, second because it doesn’t support the incredible Google Earth, and third because it can’t handle displaying locations on non-geographic surfaces (more on that later). Another extension, Semantic Layers, also exists, which uses the open-source OpenLayers mapping service, but it’s had some problems since the beginning that were never fully resolved.

Anyway, yesterday and the day before, Jeroen released the two extensions that he’s been working on, that are meant to provide the generic solution for all of SMW’s mapping needs: they are the Maps and Semantic Maps extensions. Here’s how the two work together: Maps handles the display of individual points, along with geocoding (determining the coordinates of a specific address); and Semantic Maps handles the display of multiple points on a map, defined via Semantic MediaWiki, as well as providing maps as Semantic Forms form inputs. Both support the same mapping services, currently three: Google Maps, Yahoo! Maps and OpenLayers.

Jeroen has been keeping track of all the progress on his blog, which has a lot of information on all of this stuff, plus some great screenshots, including this rather breathtaking one of Google Earth being used as a form input.

There’s still a month left in the Google Summer of Code, and Jeroen and I are excited about the extra cushion of time that provides, because it means that there’s an opportunity to add extra features to the system; like being able to show a clickable list of points near each map, so that maps can work more like this; and being able to use OpenLayers to display locations on non-geographic surfaces, such as images. That second one opens up a lot of possibilities, because it allows for things like annotated anatomical charts (see here for an example, from the Semantic Layers wiki) and displaying points on floorplans (see here for an example from the same wiki). For the latter, the example provided is for a video game, although you could easily imagine the same concept being used for more practical purposes, such as displaying events at a conference, or… showing the locations of enemy combatants in a building (hey, I’m allowed to fantasize a little, right?).

By a stroke of good timing, on Saturday I’ll actually be speaking at the New York City wiki-conference (basically a smaller-scale version of Wikimania), on the subject of all this mapping stuff; and hopefully being able to do a Steve-Jobs-at-Macworld thing, where I demo a recently unveiled technology to the crowd. Here’s a link to the panel I’ll be on: “Mapping in MediaWiki”. It’s free to attend, if anyone’s interested.

External Data grows again

June 23rd, 2009

The latest version of the External Data extension now lets you get data from two other sources (in addition to APIs and text files): LDAP servers, and database tables. This is a nice step forward, in that it’s no longer completely necessary to create an API for every data source you want to access from the wiki; which makes the concept of using MediaWiki for data integration potentially simpler and less breakable. Thanks to David Macdonald for this new functionality.

Semantic Bundle launched

June 22nd, 2009

Announcing Semantic Bundle - a single downloadable file that holds Semantic MediaWiki and 16 other MediaWiki extensions that use it and/or are often used in conjunction with it. The aim is to simplify the confusing landscape of extensions that’s evolved around Semantic MediaWiki, so that users can just get one file instead of having to research and download many files individually to get all the functionality they would want. What we have is a basic super-set of the kinds of extensions people usually end up using on SMW-driven wikis (administrators can choose which of the extensions to include, once they’ve downloaded the bundle.)

Semantic Bundle is similar to the SMW+ package distributed by Ontoprise, although it’s a different set of extensions; both include SMW, of course, but other than that the number of extensions they have in common is surprisingly small - which just goes to show how diverse the set of features has become, and may be another argument for this kind of “curatorial” work.

Semantic Bundle was developed, and is distributed, by Sergey Chernyshev and me.

Meeting Metaweb

June 17th, 2009

I had a very interesting meeting about a week and a half ago with Robert Cook, the co-founder of Metaweb, i.e. the people behind Freebase. By sheer coincidence, we know someone (non-technical) in common, and he was visiting New York, so it all worked out. I certainly learned a good amount. For one thing, it was a pleasant surprise to find out that he’s a very friendly and personable guy.

The meeting also cleared up some misconceptions I had had about Freebase, and their future plans. I had always thought of Freebase and Semantic MediaWiki as rivals - friendly rivals, perhaps, but still creators of similar products, possibly competing for some of the same customers. And if Wikipedia ever started using SMW, I imagined we’d become pretty much direct competitors, since the other co-founder of Metaweb, Danny Hillis, has referred to Freebase as “Wikipedia for data”. But it turned out that, far from fearing or being skeptical of Wikipedia adopting Semantic MediaWiki, Robert was very excited about the idea, and wanted to know what he could do to help.

As I found out, Metaweb sees Freebase more as an aggregator of data than an original source of it (that’s my understanding, anyway). In other words, though users can directly add information to Freebase through the form interface, the much more important source is sites like Wikipedia, MusicBrainz, EDGAR, etc. Freebase’s strengths lie in matching up entities (i.e., knowing that data about a book from two different databases are about the same book), as well as querying and browsing - they have an extremely fast storage and querying system for their millions of items of data, and some slick interfaces for browsing through it all (see Parallax).

So a two-part solution suggests itself: Wikipedia, with some sort of semantic capability, handles the entry and display of data, along with basic aggregation, like lists and tables (and possibly maps and timelines, etc.); while Freebase takes in the data, then handles the complex browsing and querying that Wikipedia probably couldn’t allow, for performance reasons. Other sites could allow for querying and browsing of Wikipedia’s data as well, of course, but Freebase looks like they’re in a unique position to handle it all.

There’s also Freebase’s entity match-up, which is at the heart of Freebase’s new Common Tag effort. The idea is that, instead of using plain text tags for blog posts, news articles, etc., you use Freebase entity IDs - so that there won’t be ambiguity about what a tag means. It’ll be interesting if this initiative takes off - as Robert noted, it’s not a substitute for true semantic triples, but it beats having “an ambiguous relationship to an ambiguous entity” (my recollection of how he described current tags).
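
To make the ambiguity point a bit more concrete, here is a rough Python sketch of the difference between tagging with plain text and tagging with entity IDs. It is only an illustration: the posts, the field names and the two IDs are invented placeholders, not real Freebase identifiers, and this is not how Common Tag itself is implemented.

```python
from collections import defaultdict

# Hypothetical entity IDs, made up for this example (not real Freebase IDs).
JAGUAR_ANIMAL = "/example/jaguar_animal"
JAGUAR_CAR = "/example/jaguar_car"

# Two invented posts that share the same plain-text tag but are about
# different things.
posts = [
    {
        "title": "Big cats of South America",
        "text_tags": ["jaguar"],
        "entity_tags": [JAGUAR_ANIMAL],
    },
    {
        "title": "Classic British sports cars",
        "text_tags": ["jaguar"],
        "entity_tags": [JAGUAR_CAR],
    },
]


def index_by(posts, key):
    """Build a tag -> list of post titles index, using the given tag field."""
    index = defaultdict(list)
    for post in posts:
        for tag in post[key]:
            index[tag].append(post["title"])
    return dict(index)


by_text = index_by(posts, "text_tags")
by_entity = index_by(posts, "entity_tags")

# The plain-text tag lumps both posts together...
print(by_text["jaguar"])
# ...while each entity ID points at exactly one meaning.
print(by_entity[JAGUAR_ANIMAL])
print(by_entity[JAGUAR_CAR])
```

Note that the entity tag still says nothing about how the post relates to the entity, which is the gap Robert was pointing to when he compared this to true semantic triples.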

SMW helps win contests, UPDATE: I can’t read very well

June 9th, 2009

Okay, all of the stuff I wrote before happened, but it was this time last year, not this year. I was off by an entire year. It’s still cool, though - maybe more impressive, actually, given how much functionality has been added to Semantic MediaWiki, etc. since last year. Anyway, what’s written below is not timely in the least.


This is cool. The company 23andMe creates reports for people on their genetic profiles - it doesn’t send anyone their entire DNA chain, but just notifies them about the presence of SNPs (“snips”), which, as I understand it, are DNA sequences considered specifically informative. (The company’s also known for being founded by Google co-founder Sergey Brin’s wife, but I digress.) Anyway, in April they ran a contest in which they published the 23andMe data for an anonymous woman, and those who took part had to guess at as many of her attributes as possible. The winner was announced three weeks ago, and it was Mike Cariaso, whom I always enjoy talking to, and who runs the site SNPedia.com (“snipedia”). In his winning entry, he gave details for her race, hair and eye color, proclivity for diseases, and more intangible things like personality and intelligence. In their announcement of the winner, the company didn’t say which of the details were accurate, but if even half of them are, it’s a surprising (to me) level of detail.

In any case, the really neat thing is that Mike used SNPedia as the database to get all this information; and SNPedia is a wiki that runs on Semantic MediaWiki, and Semantic Forms. So I think it’s great proof that SMW can compete with any technology out there at the moment as far as enabling open, collaborative databases. (Oh, and the prize is a free genetic screening, which sounds good if you’re into that sort of thing.)