Gintern Balls Logo
 

December 01, 2008

Kaitlin Duck Sherwood: Programming persistence

Warning: this is a long and geeky post.

From time to time in the past few years, I have mentioned that I was a little puzzled as to why more people didn’t render tiles on-the-fly for Google Maps, as I do in my U.S. Census Bureau/Google Maps mashup.

I have reappraised my attitude.  I have been redoing my mapping framework to make it easier to use.  I have reminded myself of all the hurdles I had to overcome, and discovered a number of annoying new ones.

First pass

I originally wrote my mapping framework in an extreme hurry.  It was a term project, and a month before the end of the term, I realized that it would be good for personal reasons to hand it in a week early.  The code functioned well enough to get me an A+, but I cut a huge number of corners.

Language/libraries/database choice

It was very important to minimize risk, so I wrote the framework in C++.  I would have liked to use a scripting language, but I knew that I would need to use a graphics library and a library to interpret shapefiles.  The only ones I found that looked reasonable were C-based libraries (Frank Warmerdam’s Shapelib library and Thomas Boutell’s gd library).  I knew it was possible to call C libraries from a scripting language using a tool called SWIG, but I hadn’t ever used SWIG and had heard that it was touchy.  Doing it in C++ was guaranteed to be painful, but I knew what the limits of that pain were.  I didn’t know what the limits of pain of using SWIG would be.

Projection

I also had problems figuring out how to convert from latitude/longitude to pixel coordinates in the Google tile space.  At the time (December 2005), I had a hard time simply finding out what the mathematics of the Mercator transformation were.  (It is easier to find Mercator projection information now.)  I was able to figure out something that worked most of the time, but if you zoomed out past a certain level, there would be a consistent error in the y-coordinates.  The more you zoomed out, the bigger the error.  I’m pretty sure it’s some sort of rounding error.  I looked at it several times, trying to figure out where I could possibly have a roundoff error, but never did figure it out.  I just restricted how far people could zoom out.  (It also took a very long time to render tiles if you were way zoomed out, so it seemed reasonable to restrict it.)

Polygon intersection

I remember that I spent quite a lot of time on my polygon intersection code. I believe that I looked around the Web and didn’t find any helpful code, so developed it from scratch on little sleep. (Remember, I was doing this in a frantic, frantic hurry.) I ended up with eight comparisons that needed to be done for each polygon in the database for every tile. More on this later.

Rendering bug

The version I handed in had a bug where horizontal lines would show up at the bottom of tiles frequently, as you can see in the bottom left tile:

It seemed pretty obvious that the bug was my fault, as gd is a very mature and well-used graphics library: my old office partner Carlos Pero had used it way back in 1994 to develop Carlos’ Coloring Book.

After I handed in my project, I spent quite a lot of time going through my code trying to figure out where the problem was with no luck.  Frustrated, I downloaded and built gd so that I could put breakpoints into the gd code.  Much to my surprise, I discovered that the bug was in the gd library!  I thus had to study and understand the gd code, fix it, report the bug (and patch), and of course blog about it so that other people wouldn’t have the same problem.

Pointing my code to the fixed gd

Then, in order to actually get the fix, I had to figure out how to statically link gd into my binaries. I like my ISP (Dreamhost) and wasn’t particularly interested in changing, but that meant I couldn’t use the system-installed gd libraries.  Statically linking wasn’t a huge deal, but it took me at least several hours to figure out which flag to insert where in my makefile to get it to build statically.  It was just one more thing.

Second pass

I have graduated, but haven’t found a job yet, so I decided to revamp my mapping framework. In addition to the aesthetic joy of making nice clean code:

  • It would be an opportunity to learn and demonstrate competence in another technology.
  • I had ideas for how I could improve the performance by pre-computing some things.
  • With a more flexible framework, I would be able to do some cool new mashups that I figured would get me more exposure, and hence lead to some consulting jobs.

Language/libraries/database choice

Vancouver is a PHP town, so I thought I’d give PHP a shot. I expected that I might have to rewrite my code in C++ eventually, but that I could get the basics of my improved algorithms shaken out first.  (I’m not done yet, but so far, I have been very very pleased with that strategy.)

I also decided to use MySQL.  While the feeling in the GIS community is that Postgres’ GIS extensions (PostGIS) are better than the GIS extensions to MySQL, I can’t run Postgres on my ISP, and MySQL is used more than Postgres.

I had installed PHP4 and MySQL 4 on my home computer some time ago, when I was working on Mapeteria.  However, I recently upgraded my home Ubuntu installation to Hardy Heron, and PHP4 was no longer supported.  That meant I needed to install a variety of packages, and I went through a process of downloading, trying, discovering I was missing a package, downloading/installing, discovering I was missing a package, lather, rinse, repeat.  I needed to install mysql-server-5.0, mysql-client-5.0, php5, php5-mcrypt, php5-cli, php5-gd, libgd2-xpm-dev, php5-mysql, and php5-curl.  I also spent some time trying to figure out why php5 wouldn’t run scripts that were in my cgi-bin directory before discovering that with mod_php, it was supposed to run from everywhere but the cgi-bin directory.

Note that I could have done all my development on my ISP’s machines, but that seemed clunky.  I knew I’d want to be able to develop offline at some point, so wanted to get it done sooner rather than later.  It’s also a little faster to develop on my local system.

I did a little bit of looking around for a graphics library, but stopped when I found that PHP had hooks to the gd library.  I knew that if gd had not yet incorporated my horizontal lines bug fix, then I might have to drop back to C++ in order to link in “my” gd, but I figured I could worry about that later.

Projection

I made a conscious decision to write my Mercator conversion code from scratch, without looking at my C++ code.  I did this because I didn’t want to be influenced in a way that might lead me to get the same error at way-zoomed-out that I did before.  I was able to find equations on the Wikipedia Mercator page for transforming Mercator coordinates to X-Y coordinates, but those equations didn’t give a scale for the X-Y coordinates!  It took some trial and error to figure that out.
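
For illustration, here is a minimal sketch of one common form of this conversion, written in Python rather than the PHP the framework uses. It is not the framework's code. It assumes 256-pixel tiles, a single tile covering the whole world at zoom level 0, and a pixel scale that doubles with each zoom level; the function names are made up for this example.

```python
import math

TILE_SIZE = 256  # assumed tile edge length in pixels


def latlon_to_world_pixels(lat_deg, lon_deg, zoom):
    """Project latitude/longitude (degrees) to pixel coordinates on the
    whole-world map at the given zoom level (y grows southward)."""
    scale = TILE_SIZE * (2 ** zoom)  # assumed width of the world in pixels
    x = (lon_deg + 180.0) / 360.0 * scale
    sin_lat = math.sin(math.radians(lat_deg))
    # Spherical Mercator: y = ln((1 + sin(lat)) / (1 - sin(lat))) / 2,
    # rescaled so that the equator lands in the middle of the map.
    y = (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * scale
    return x, y


def world_pixels_to_tile(x, y):
    """Split world pixel coordinates into (tile x, tile y) and the
    pixel offset inside that tile."""
    tile_x, pixel_x = divmod(int(x), TILE_SIZE)
    tile_y, pixel_y = divmod(int(y), TILE_SIZE)
    return (tile_x, tile_y), (pixel_x, pixel_y)
```

The scale factor is the piece that the projection equations alone do not supply: here it is simply the assumed pixel width of the world at a given zoom level.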

Data

For the initial development, I decided to use country boundaries instead of census tract boundaries. The code wouldn’t care which data it was using, and it would be nice to have tiles that would render faster when way-zoomed-out. I whipped up a script to read a KML file with country boundaries (that I got from Valery Hronusov and used in my Mapeteria project) and load it into MySQL.  Unfortunately, I had real problems with precision.  I don’t remember whether it was PHP or MySQL, but I kept losing some precision in the latitude and longitude when I read and uploaded it.  I eventually converted to uploading integers that were 1,000,000 times the latitude and longitude, and so had no rounding difficulties.
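
As a rough sketch of the integer trick (in Python, with hypothetical helper names, not the actual loading script): each coordinate is multiplied by 1,000,000 and rounded once on the way in, so everything stored and compared afterwards is an exact integer.

```python
SCALE = 1_000_000  # integer units per degree, as described in the post


def to_scaled_int(degrees):
    """Convert a latitude or longitude in degrees to an integer number of
    millionths of a degree, rounding once at conversion time."""
    return int(round(degrees * SCALE))


def from_scaled_int(value):
    """Convert the stored integer back to floating-point degrees."""
    return value / SCALE
```

Rounding (rather than truncating) matters here, because a value like 49.2827 may be represented in floating point as slightly less than 49.2827.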

One thing that helped me enormously when working on the projection algorithm was to gather actual data from Google.  I found a number of places on the Google maps where three territories (e.g. British Columbia, Alberta, and Montana) came together.  I would determine the latitude/longitude of those points, then figure out what the tile coordinates, pixel X, and pixel Y of that point were for various zoom levels.  That let me assemble high-quality test cases, which were absolutely essential in figuring out what the transformation algorithm should be, but it was very slow, boring, and tedious to collect that data.

Polygon intersection

When it came time to implement my polygon bounding box intersection code again, I looked at my old polygon intersection code, saw that it took eight comparisons, and thought to myself, “That can’t be right!”  Indeed, it took me very little time to come up with a version with only four comparisons (and I was now able to find sources on the Web that describe that algorithm).
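
A minimal Python sketch of the usual four-comparison test for axis-aligned bounding boxes follows. It is an illustration, not the framework's code; the Box type and its field names are assumptions made for this example, and boxes that merely touch along an edge are counted as intersecting.

```python
from collections import namedtuple

# Hypothetical bounding-box type: min/max corners in any consistent units.
Box = namedtuple("Box", ["min_x", "min_y", "max_x", "max_y"])


def boxes_intersect(a, b):
    """Return True if the two boxes overlap or touch.

    Two boxes overlap exactly when they overlap on the x axis and on the
    y axis, which takes two comparisons per axis, four in total.
    """
    return (a.min_x <= b.max_x and b.min_x <= a.max_x and
            a.min_y <= b.max_y and b.min_y <= a.max_y)
```

For tile rendering, one box would be the tile and the other the bounding box of a polygon; only polygons whose boxes pass this test need to be drawn.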

Stored procedures

One thing that I saw regularly in employment ads was a request for use of stored procedures, which became available with MySQL 5.  It seemed reasonable that using a stored procedure to calculate the bounding box intersection would be even faster, so I ran some timing tests.  In one, I used PHP to generate a complex WHERE clause string from eight values; in the other, I passed eight values to a stored procedure and used that in the WHERE clause.  Much to my surprise, it took almost 20 times more time to use the stored procedure!  I think I understand why, but it was interesting to discover that it was not always faster.

GIS extensions

My beloved husband had been harping on me to use the built-in GIS extensions.  I had been ignoring him because a large part of the point of this exercise was to learn more about MySQL, including stored procedures, but now that I found that the stored procedure was slow, it was time to time the built-in bounding box intersection routine.  If I stored the bounding box as a POLYGON type instead of as two coordinate pairs, then it took half the time.  Woot!

Rendering

I discovered that despite my having reported the horizontal lines bug fifteen months ago, the gd team hasn’t done anything with it yet.  Needless to say, this means that the version of libgd.a on Dreamhost has the bug in it. I thought about porting back to C++. I figured that porting back would probably take at minimum a week, and would raise the possibility of nasty pointer bugs, so it was worth spending a few days trying to get PHP to use my version of gd.

It is possible to compile your own version of PHP and use it, though it means using the CGI version of PHP instead of mod_php. I looked around for information on how to do that, and found a Dreamhost page on how to do so... but failed utterly when I followed the directions. I almost gave up at that point, but sent a detailed message to Dreamhost customer support explaining what I was trying to do, why, and what was blocking me. On US Thanksgiving Day, I got a very thoughtful response back from Robert at Dreamhost customer support which pointed me at a different how-to-compile-PHP-on-Dreamhost page that ultimately proved successful.  (This is part of why I like Dreamhost and don’t really want to change ISPs.)

Compiling unfamiliar packages can be a real pain, and this was no different.  The Dreamhost page (on their user-generated wiki) had a few scripts that would do the install/build for me, but they weren’t the whole story.  Each of the scripts downloaded a number of projects (like openSSL, IMAP, CURL, etc) in compressed form, extracted the files, and built them.  The scripts were somewhat fragile — they would just break if something didn’t work right.  They were sometimes opaque — they didn’t always print an error message if something broke.  If there was a problem, they started over from the beginning, removing everything that had been downloaded and extracted.  Something small — like if the mirror site for mcrypt was so busy that the download timed out — would mean starting from scratch.  (I ended up iteratively commenting out large swaths of the scripts so that I wouldn’t have to redo work.)

There was some problem with the IMAP build having to do with SSL.  I finally changed one of the flags so that IMAP built without SSL — figuring that I was unlikely to be using this instance of PHP to do IMAP, let alone IMAP with SSL — but it took several false starts, each taking quite a long time to go through.

Finally, once I got it to build without my custom gd, I tried folding in my gd.  I uploaded my gd/.libs directory, but that wasn’t enough — it wanted the gd.h file.  I suppose I could have tried to figure out what it wanted, where it wanted it, but I figured it would be faster to just re-build gd on my Dreamhost account, then do a make install to some local directory. Uploading my source was fast and the build was slow but straightforward. However, I couldn’t figure out how to specify where the install should go. The makefiles were all autogenerated and very difficult to follow. I tried to figure out where in configure the install directory got set, but that too was hard to decipher. Finally, I just hand-edited the default installation directory. So there. That worked. PHP built!

Unfortunately, it wouldn’t run. It turned out that the installation script had a bug in it:

cp ${INSTALLDIR}/bin/php ${HOME}/${DOMAIN}/cgi-bin/php.cgi

instead of

cp ${INSTALLDIR}/bin/php.cgi ${HOME}/${DOMAIN}/cgi-bin/php.cgi

But finally, after all that, success!

Bottom line

So let me review what it took to get to tile rendering:

  1. Choose a database and figure out how to extract data from it, requiring reading and learning.
  2. Find and load boundary information into the database, requiring trial and error.
  3. Choose a graphics library and figure out how to draw coloured polygons with it, requiring reading and learning.
  4. Gather test cases for converting from latitude/longitude into Google coordinate system, requiring patience.
  5. Figure out how to translate from latitude/longitude pairs into the Google coordinate system, requiring algorithmic skills.
  6. Diagnose and fix a bug in a large-ish C graphics library, requiring skills debugging in C.
  7. Download and install PHP and MySQL, requiring system administration skills.
  8. Figure out how to build a custom PHP, requiring understanding of bash scripts and makefiles.

So now, I guess it isn’t that easy to generate tiles!

Note: there is an entirely different ecosystem for generating tiles, one that comes from the mainline GIS world, one that descends from the ESRI ecosystem. I expect that I could have used PostGIS and GeoTools with uDig; they look like fine tools, but they are complex tools with many many features.  Had I gone that route, I would have had to wade through a lot of documentation of features I didn’t care about.  (I also would have had to figure out which ISP to move to in order to get Postgres.)  I think that it would have taken me long enough to learn / install that ecosystem’s tools that it wouldn’t have been worth it for the relatively simple things that I needed to do.  Your mileage may vary.

by ducky

November 28, 2008

Vincent Cheung: Encrypted Blog Posts Ver. 2

Over 2 years ago, I developed an encryption system that you could use to encrypt blog posts.

Since then, I have written several encrypted blog posts that were about particularly personal things and I didn't want the general public or certain people reading them. You can ask me for one of the keys if you want, but I can't guarantee that you'll get it :p

I just released a new version of my JavaScript Encryption and Decryption system.

The new encryption code is a lot faster than the old version, the webpage decryption code now uses a fancy dialog box to ask you for the key, and I fixed some bugs. The entire process has been greatly simplified and the encryption page now automatically generates code that you can copy and paste into your website.

I'm starting to actually make use of my personal website, VincentCheung.ca, which is where I'm hosting my random side projects. I'm just finishing up the new design on that site in preparation for an upcoming announcement.

by noreply@blogger.com (Vince)

November 27, 2008

Kaitlin Duck Sherwood: Cost of the bailout

There has been discussion about how the current financial system bailout is the most expensive government program ever, according to numbers from Jim Bianco (as I saw it reported by Barry Ritholtz).

Bianco’s numbers are adjusted for inflation, which is good, but that isn’t a complete picture.  There are an awful lot more Americans now than there used to be. If you look at the bailout in terms of per capita cost or as a percentage of GNP, you’ll see that there were a few other programs that were comparably expensive.  So yeah, it’s bad.  Yeah, it’s a big deal.  But we have seen worse.

Program                 | Inflation-adjusted cost (billions of USD) | Cost per capita (thousands of USD) | % of GNP
Marshall Plan           | 115.3                                     | 0.78                               | 4.7%
Savings and Loan crisis | 256                                       | 0.95                               | 1.7%
Moon shot               | 237                                       | 1.2                                | 4.3%
Iraq war                | 587                                       | 2.0                                | 4.4%
Korean war              | 454                                       | 2.9                                | 15.0%
Vietnam war             | 698                                       | 3.4                                | 11.2%
New Deal                | 500                                       | 4.0                                | 55.1%
Current bailout         | 4616                                      | 15.3                               | 33.2%
World War II            | 3600                                      | 26.3                               | 150%
Louisiana Purchase      | 217                                       | 40.9                               | NA

Notes: It was surprisingly hard to find historical GNP figures. It only started being recorded in 1947, and the sources aren’t always clear if the figures are inflation-adjusted or not. Also, most of these things spanned several years; I picked a year near the middle for the calculations. Bottom line: take the % of GNP numbers with a grain of salt. They are close, but not exact.

I used the Flow of Funds Accounts of the United States for 1947-2007, and a very poorly annotated list from Duke for the New Deal and WW2 numbers. Sorry, there are no GNP numbers from 1803.
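
To make the per-capita arithmetic concrete, here is a small Python sketch. The function names are made up for this example, and the only inputs are the figures in the table above. Because billions of dollars divided by millions of people gives thousands of dollars per person, the calculation can also be run backwards to see what population each row implies.

```python
def cost_per_capita_thousands(cost_billions, population_millions):
    """Billions of dollars / millions of people = thousands of dollars each."""
    return cost_billions / population_millions


def implied_population_millions(cost_billions, per_capita_thousands):
    """Invert the per-capita figure to recover the population it implies."""
    return cost_billions / per_capita_thousands


# Figures taken from the table: (cost in billions, per-capita in thousands).
bailout = implied_population_millions(4616, 15.3)
marshall = implied_population_millions(115.3, 0.78)
```

With the table's numbers, the bailout row implies a population of a little over 300 million and the Marshall Plan row a little under 150 million, which is a quick sanity check that the per-capita column uses the population of the era and not today's.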

by ducky


Kaitlin Duck Sherwood: The myth of mixed-gender parent superiority

Our Loyal Opposition in the marriage equality fight likes to yammer about how research shows that children do better when they are raised by both of their biological parents.  This is utter hogwash.

The Loyal Opposition uses studies that show that children raised by both of their biological parents do better than those raised by a single parent.  Studies comparing kids raised by a mixed-gender couple with those raised by a same-gender couple show absolutely no difference on many many measures of success and well-being: delinquency, dropout rate, alcoholism, teen pregnancy, drug abuse, etc.  By contrast, the difference between kids from two-parent families and kids from single-parent families was absolutely huge on all of the measures of success and well-being.

My source for the research on family structure effect on children’s well-being is an extensive longitudinal literature review that the Santa Clara County Social Services Agency did in 1996, a time when you would think society just might have made life even more difficult for gay and lesbian parents.

The only measure where there was any difference was a very very slight (but statistically significant) difference in sexual experimentation: children of gay/lesbian parents were no more likely to be gay/lesbian themselves, but they were very slightly more likely to experiment with homosexuality a few times.

While I am not familiar with any research on biological vs. non-biological two-parent families, it isn’t relevant.  If there is a kid who needs adoption, their adoptive family won’t be their biological family, regardless of whether they get placed with a straight or homosexual couple.

I don’t know of any research that suggests that children of parents who used donated eggs or sperm are less happy than biological children.  I suppose it could be true, but if it is, The Loyal Opposition should oppose infertility treatments of all kinds.  Somehow I expect they wouldn’t take on that fight.

I know some people who think that gay and lesbian couples shouldn’t adopt because their children would face discrimination.  By that logic, we shouldn’t allow black people to have children in the US; we shouldn’t allow Christians to have children in China.

Even if there were some difference between parenting by gay and lesbian couples and straight couples, that still isn’t an adequate reason to try and block their child-rearing.  That’s a false comparison.  The real comparison that you need to make is between children in foster care and those that get adopted.  I suspect that adopted kids do far better than those that remain in foster care, and there is a surplus of kids to adopt.

While it is true that it is difficult to find healthy white babies to adopt, sadly, there are lots of non-white, non-healthy babies available.  When my husband and I were going through foster parent training, Santa Clara County had seven times as many foster children as they had foster homes.  SEVEN TIMES.  (And you can be sure not every foster home took seven children!)

We should celebrate and encourage gay and lesbian adoption, not hinder it!

by ducky


Kaitlin Duck Sherwood: Obama’s middle-class values

One of the things I really like about Obama is what, for lack of a better term, I will call middle-class values.  He does things like clean up after himself at an ice cream shop, carry his own luggage (pictures here and here and here) and says he turns off lights and will make his kids do chores in the White House.  I don’t recall ever seeing any of the presidents in my adult lifetime — Reagan, Bush 41, Clinton, or Bush 43 — ever carrying anything, even when they were campaigning.  I suspect that Bush 41 never washed a dish or picked up dog poop — ever.  I can’t imagine that either of the Clintons would do so now.

People in power frequently have other people do mundane things for them.  There is a potential that, by doing things himself, Obama could make himself seem less powerful. Jimmy Carter once spent the night in a private home, and it was reported in all the newspapers that he made the bed himself.  My recollection of that is that people were kind of incredulous at him diminishing himself that way.  However, Jimmy Carter ran with a persona of folksiness.  (He was Jimmy Carter, not James Earl Carter, Jr.)  He had to struggle a little against being perceived as a rube, a southern bumpkin.

I don’t think Obama really risks debasing himself in the public eye by doing mundane things for himself.  In contrast to Jimmy Carter, Obama has a public persona that is a bit cold and standoffish.  He even got attacked for being elitist for a little while.  Doing mundane things for himself counters that perception.

Maybe he carefully does these mundane things for show.  Maybe he’s conscious of it and wants to “keep it real”.  But maybe it’s part of his value system that he is not inherently better than other people, and should play by the same rules as the rest of the world.  (Unlike, say, Arnold Schwarzenegger, who has demonstrated a pattern throughout his life of acting like rules were for other people.)  I hope so.

by ducky

November 26, 2008

Kaitlin Duck Sherwood: Marriage equality a threat to men’s self-image?

Salon has an interesting interview with Richard Rodriguez, who says — as I do — that the fight over “protecting traditional marriage” is really about protecting traditional gender roles.  However, he spotted something that I missed: the role of male insecurity.

And the majority of American women are now living alone. We are raising children in America without fathers. I think of Michael Phelps at the Olympics with his mother in the stands. His father was completely absent. He was negligible; no one refers to him, no one noticed his absence.

The possibility that a whole new generation of American males is being raised by women without men is very challenging for the churches. I think they want to reassert some sort of male authority over the order of things. I think the pro-Proposition 8 movement was really galvanized by an insecurity that churches are feeling now with the rise of women.

I have been struck in the past at how when The Loyal Opposition talks about gay and lesbian people adopting, they usually emphasize, “a child needs a mother and a father”.  It’s usually men I see saying this; Rodriguez’ interview makes me think that what they are really saying is, “Men are important!  We are!  We are!  We are!”, trying to convince both us and themselves that it is true.

(By the way, children do just fine with same-gender parents.)

by ducky

November 21, 2008

Kaitlin Duck Sherwood: Anti-marriage-equality piece reflects values

I found an anti-marriage-equality piece (via Andrew Sullivan) that was very interesting to me because of how it reflected its values.

I saw a striking example of what Jonathan Haidt has found about differences in morality between liberals and conservatives. Haidt found that conservatives are more likely to value “moral purity”, which basically says “if it feels icky to me, then it must be morally wrong”.

In his essay, Rod Dreher quotes University of Virginia sociologist James Davison Hunter:

“The momentum is toward experience and emotions and feelings. People are saying, ‘I feel, therefore I am.’ This is how more and more people are deciding what is real and right and true.”

Dreher complains that liberals don’t value that:

You can see this in the remarkable unwillingness of many gay-marriage defenders to grant their opponents any moral standing. To disagree with them is to reveal yourself to be a “bigot” (I heard a married, straight young Republican in Texas use that word to describe those who voted for Prop 8; he was far from the only one). Bigots are by definition people whose prejudices are irrational. Bigots are moral cretins who can’t be talked to, only coerced. One is under no obligation to compromise with a bigot, only to smash him.

I think he’s absolutely right.  Liberals cannot understand the value that “if it feels icky, it must be wrong” (especially if “it” doesn’t feel icky to the liberals).  Furthermore, there is no arguing with such a “moral purity” value.  Joe Liberal cannot reason their way to making Joe Conservative feel less icky; Joe Liberal sees it as irrational because it is not rational by definition.  It is emotional.

Dreher also laments the loss of the “meaning of marriage”:

Though no consensus on gay marriage now exists, the trend lines are not in traditionalists’ favor, in large part because our culture has lost its understanding of what marriage is for. That is, marriage no longer has a settled meaning beyond a nominalist one: it is a contract formalizing the positive emotions two people (for now) have for one another, and binding them in a legal and social framework.

I, a liberal, read that, and go, “yes, that is exactly what civil marriage is”.  (I even have an old blog posting titled “Civil marriage is a contract”!)

Dreher doesn’t explain what the “meaning of marriage” is, but Andrew Sullivan (who perhaps is more familiar with Dreher’s corpus) says:

Rod longs, as many do, for a return to the days when civil marriage brought with it a whole bundle of collectively-shared, unchallenged, teleological, and largely Judeo-Christian, attributes. Civil marriage once reflected a great deal of cultural and religious assumptions: that women’s role was in the household, deferring to men; that marriage was about procreation, which could not be contracepted; that marriage was always and everywhere for life; that marriage was a central way of celebrating the primacy of male heterosexuality, in which women were deferent, non-heterosexuals rendered invisible and unmentionable, and thus the vexing questions of sexual identity and orientation banished to the catch-all category of sin and otherness, rather than universal human nature.

This is exactly what I was getting at in this post and in the first paragraph of this post.  Marriage equality is not a threat to traditional marriage.  It is a threat to traditional gender roles.

by ducky

November 20, 2008

Kaitlin Duck Sherwood: LOLcats representing the human spirit

A while back, I wrote about LOLcats being a stand-in for ethnic groups, allowing us the humour of shared stereotypes but without having to saddle an ethnic group with those stereotypes.

Jay Dixit has a more expansive, romantic take on it: LOLcats are stand-ins for humans in all their glory and pathos.  By being stand-ins, they are less emotionally dangerous:

By articulating profound feelings through cats and marine mammals speaking garbled English, we’re able to shroud genuine emotions in pseudo-irony — which means those animals can evoke deeper emotions without fear of mockery or cheapness.

I’ll put it more simply: humour is pain at a distance. Using cats (or dogs or walruses) lets us put even more distance between us and the pain. We can thus tolerate situations in LOLcats that would be too painful if it were about humans.

Hmm, I wonder if this is why animated cartoons so frequently starred animals (e.g. Mickey Mouse, Roadrunner, Foghorn Leghorn)…

by ducky


Kaitlin Duck Sherwood: email tool

I spoke a while back with Jason Gallic, the Product Marketing Manager for Email Center Pro.  They have a product designed for improving email-based customer service, including automatic reply templates.

Fourteen years ago, I got to use a webmail system, @ATS, developed for the National Center for Supercomputing Applications by the talented Ben Johnson.  (Ben?  Email me!)  @ATS let you set up filters that would suggest a response if the condition you specified was met.  When I read a message, after the message at the bottom, there would be a few checkboxes next to titles of suggested responses. I had the option of selecting any or none of the checkboxes, then pressing either a “Send as is” button or “Edit response” button.  @ATS would include the responses that I checked, and send/let me edit it.

For example, I had one filter set up to suggest the “Undergraduate admissions” answer if the word “admissions” was in the body of the message.  I had another filter which suggested the “Graduate admissions” answer if the word “admissions” was in the body of the message.  By reading the message, I could sometimes tell if they were interested in graduate or undergraduate admissions, in which case I would click the appropriate box and send it on.  Sometimes I couldn’t tell, so I would click both boxes and send it on.  Sometimes I wanted to add a little extra information that I happened to know (if, for example, they asked about who would be a good advisor for research on hydrogen embrittlement in high-carbon steels); then I would check the “graduate admissions” box and add the additional information before sending it on.
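
The filtering idea is simple enough to sketch in a few lines of Python. This is only an illustration of the behaviour described above, not @ATS itself: the data layout and function name are invented for the example, and matching the keyword case-insensitively is an assumption.

```python
# Each filter pairs a trigger word with the title of a canned response.
FILTERS = [
    ("admissions", "Undergraduate admissions"),
    ("admissions", "Graduate admissions"),
]


def suggest_responses(body, filters=FILTERS):
    """Return the titles of every canned response whose trigger word
    appears in the message body (case-insensitive, by assumption)."""
    lowered = body.lower()
    return [title for keyword, title in filters if keyword in lowered]
```

The important design point is that the tool only suggests: the person reading the message still picks which of the suggested responses to check, and can edit before sending.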

I ranted to Jason about how useful auto-suggest is; we’ll see if he manages to get it into his product.

by ducky

November 18, 2008

Ben Maurer: Amazon's CloudFront CDN: disappointing

I took a look at CloudFront today. They have really good intentions. The CDN space is quite a mess -- it could easily be a pay-as-you-go, self-service industry. However, players such as Akamai try to make a large profit. The CDN space is especially hard for small sites -- you can't get any reasonable pricing unless you are doing high levels of traffic.

Amazon wants to change all of that. However, I think they made a number of missteps in their initial offering.

  • They aren't using it on amazon.com. They use Level(3)'s CDN! Why should anybody consider using a service Amazon isn't using themselves? This is a chance to prove your CDN in real life.
  • Tiered pricing. In a self-service model, it doesn't make sense to offer different prices for different bandwidth usages. One customer with 100 TB of traffic is the same as 10 customers with 10 TB of traffic.
  • Pay per request. For S3, this made sense. Every request was one disk seek on the servers, and people need to pay for that. However, in a CDN, you are expected to serve from memory. The 1 cent per 10,000 requests effectively adds 6 KB of data to every file. So if you serve a 1 KB file, this increases your cost by 6x. At the very least, the fixed cost per request should be less than that with S3 to account for the lack of disk seeks.
  • Lack of peering. Doing a traceroute to cloudfront from a few locations (Carnegie Mellon, colos in New York and LA), it appeared that all of my traffic was going over transit links. In contrast, traffic to amazon.com went over fast and cheap peering links.
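The "6 KB per request" figure in the list above can be checked with a little arithmetic. The sketch below is only an illustration: the bandwidth price of $0.17 per GB is an assumption (the post does not state the price it used), while the request fee of 1 cent per 10,000 requests comes from the post.

```python
# Assumed bandwidth price in USD per GB; the post does not state it.
ASSUMED_PRICE_PER_GB = 0.17
# "1 cent per 10,000 requests", as quoted in the post.
REQUEST_FEE_PER_10K = 0.01


def request_fee_in_bytes(price_per_gb=ASSUMED_PRICE_PER_GB):
    """Bytes of bandwidth that cost the same as the fee for one request."""
    fee_per_request = REQUEST_FEE_PER_10K / 10_000
    price_per_byte = price_per_gb / 1e9
    return fee_per_request / price_per_byte


def cost_multiplier(file_bytes, price_per_gb=ASSUMED_PRICE_PER_GB):
    """Total cost of serving a file, relative to its bandwidth cost alone."""
    return (file_bytes + request_fee_in_bytes(price_per_gb)) / file_bytes


if __name__ == "__main__":
    print(f"request fee ~ {request_fee_in_bytes() / 1000:.1f} KB of bandwidth")
    print(f"a 1 KB file costs {cost_multiplier(1000):.1f}x its bandwidth alone")
```

Under that assumed price, the fee works out to a little under 6 KB of bandwidth per request, so the request fee on a 1 KB file is roughly six times its bandwidth cost. A different price per GB shifts the numbers proportionally.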

I do hope that Amazon fixes up CloudFront. It's a fantastic concept. They have the power to force reason into the market.

by noreply@blogger.com (Ben Maurer)


David AndersonBlack box says yes

This just in: following the discovery of the "diagnostic LED" of my black box, it took mere minutes to home in on the bug and eradicate it.

It's alive! ALIVE I SAY!

Oh, what was the bug? Let's just say that when you check, in the code of a driver, whether you properly told the power management driver to power up the chip you're driving, it would be wise to also check the code of the power management driver to make sure the power-up code is right. Because a chip with no power ain't gonna be driven nowhere.

In other news, powering up random peripherals unrelated to what you want to drive doesn't work either. No, really.

by David Anderson


David AndersonDebugging the NXT startup: a binary printf()

(Warning: very nerdy rant about very geeky topic ahead)

Debugging a NXT that crashes during the bootup sequence is hard. Before the main AVR link comes up, there is no way to even get any sound. I've already done debugging by sound: during the early stages of NxOS a couple of years back, I would debug by playing bytes I wanted to check as morse-code-like dits and dahs, one bit at a time, over the brick's speaker. It's extremely basic, but it's how I got the display driver to work.

But debugging a crash before the sound driver is in a working state is hard. You have a large binary black box. Either it boots and the sound driver works, in which case you don't have a problem, or it doesn't and you only get The Beep Of Death, the sound of the coprocessor periodically blipping the speaker to say "Your OS is screwed, I'm not playing any more".

Just now, attempting to debug one such crash, I discovered something interesting. If I initialize the sound controller and start an infinite loop of playing a tone, for some reason the pitch of the Beep Of Death changes by a few kHz for 2 beeps, then returns to its regular pitch.

This gives me a more basic equivalent of the morse code byte "printer": if the tone changes, I know that the brick booted at least up to the point of my infinite loop. If it doesn't, I know it crashed before that point. It's an audio diagnostic LED that tells me either "I managed to initialize the kernel up until this point", or "Nope, the crash occurs before execution gets to the bruteforce sound loop".

Therefore, by moving the sound loop around in the init code, I should be able to zero in on the exact crash site. The initialization black box is no longer completely black. A little information leaks out. Instead of "Everything works/doesn't work", I now have "Everything works/doesn't work up to the following intermediate point of my choosing".
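One way to organize "moving the sound loop around" is a binary search over the init steps. The post does not say which search order was actually used, so treat this as one possible strategy rather than the author's method. In the sketch below, `tone_changes_at` is a hypothetical stand-in for one flash-and-listen cycle: it answers whether the beep changed pitch with the sound loop placed just after step `k`.

```python
def find_crash_step(num_steps, tone_changes_at):
    """Return the index of the first init step that never completes.

    tone_changes_at(k) is a hypothetical probe: True means the beep changed
    pitch with the sound loop placed right after step k, so steps 0..k ran.
    Returns num_steps if every step completes (no crash found).
    """
    lo, hi = 0, num_steps  # the crash step, if any, is in [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if tone_changes_at(mid):
            lo = mid + 1  # boot got past step mid; look later
        else:
            hi = mid      # crash is at or before step mid; look earlier
    return lo


if __name__ == "__main__":
    # Simulated brick: 20 init steps, crash inside step 13.
    probes = []

    def simulated_probe(k):
        probes.append(k)
        return k < 13

    print("crash at step", find_crash_step(20, simulated_probe))
    print("reflash cycles needed:", len(probes))
```

Each probe costs a full rebuild and reflash, so halving the candidate range each time keeps the number of cycles close to the base-2 logarithm of the number of init steps.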

And, sometimes, when debugging embedded systems without proper debugging hardware, that tiny insignificant diagnostic LED is the difference between hope and despair.

by David Anderson

November 17, 2008

Manas TungareSoftware and the Democratization of Production

The availability of consumer software in this century has democratized the production of … well, everything. Parts of the current creative landscape seem no different than Marxist philosophies of workers owning the means of production, with one exception: the workers aren’t doing it for money, they’re doing it for fun.

I recently watched Be Kind Rewind, which inspired this post, or at least provided the spark behind it. In the movie, two video store employees recreate popular movies using a video camera when the original tapes get erased by a mysterious magnetic force. Their videos were, of course, of very low production quality, but the general idea was still valid: that amateur-grade equipment is approaching professional-grade equipment.

That made me realize how easy (or at least, possible) it is to create movies with affordable software on consumer hardware. When things were still in the analog domain, you would need specialized hardware to be able to shoot on film, capture audio on expensive multi-track recording equipment, and edit it all by splicing film together. Now, all you need is a digital video camera and a general-purpose high-end computer, which is effectively cheap because it can also be used for other tasks. The barrier to entry for amateur film-makers has almost been removed.

Ditto with music production: it is possible for a musician to set up a studio in his/her basement with equipment that doesn’t cost an arm and a leg. The quality of recordings made with these tools is comparable to what the studios churn out.

Publishing is no longer the domain of the publishing house: Gutenberg’s printing press now inhabits every single computer that has a printer attached to it. High-quality design tools and cheap reproduction have made publishers out of everyone. Flyers, posters, announcements, articles, books: all of them required professional assistance in the past. Newer genres such as wiki articles, blog posts and Usenet postings have been made possible by the Internet.

An entire “prosumer” grade of still cameras has made its way into the hands of millions of photographers. Shooting digitally has minimized the variable costs associated with photography, thus unshackling the amateur from budgetary constraints that professionals never had to bother about. That brings them one step closer to competing with professionals, e.g. by selling their shots as stock photos online.

As the economy turns digital, distribution is also taken care of democratically. Earlier, it would take a promoter, someone who could invest the initial millions, to take a creation to market. Today, it’s as easy as uploading it to YouTube or selling it on iTunes or printing a book on Lulu or making a shirt at CafePress. If it’s good, it will go viral. Simple as that.

Nowhere is this change more apparent than in the recently-concluded Presidential Election in the United States. Barack Obama is spoken of by many as the first YouTube president. Indeed, the numerous amateur videos posted by his fans to YouTube and Twitter and on their blogs played a major part in spreading the word about his ideas — in a way that pre-Internet generations could never have.

If I were an anthropologist circa 3000 AD, the last three decades would show up as a significant inflection point in a graph of human achievement and creativity.

Here’s to software!

by Manas

November 06, 2008

Kaitlin Duck SherwoodBlogs fomenting partisanship? No, conservatives.

In his post today, Scott Rosenberg suggests that there are people who blame the blogosphere for how intensely nasty and partisan our political world is right now.

Excuse me????

Partisan nastiness has been going on far longer than people have been blogging.  The Web was pretty well unknown during Clinton’s first term, and in its infancy during the second.  I seem to recall a whole lot of partisan bickering back then.

I don’t think that divisiveness is due to the Web; I believe that it’s due to conservatives.

That’s a little hard for me to write because I want to be fair.  But I really think it is true.

I recently read an article on research in morality that points up values differences between liberals and conservatives.  One thing that researchers found was that liberals put a much higher value on fairness than on group loyalty, while conservatives value them about equally.  This research suggests that a liberal is more likely to sacrifice group loyalty in the name of fairness than a conservative, e.g. to help a conservative do the work to send in an absentee ballot.  It also suggests that a conservative is more likely to toe the party line, even if he/she doesn’t believe in it.

When Palin was insinuating that Barack Obama wasn’t a “real” American, she was exploiting her white audiences’ high value on group loyalty.  By making it look like Obama had a different in-group, Palin made her audience worry that they might end up as the out-group.

(Being a member of an out-group might be particularly scary if you have yourself treated out-groups unfairly.  I’m just sayin’.)

I was totally unconcerned about being in Obama’s out-group.  You would think that I, a 45-year-old, hot, white woman with an upper Midwest accent, who lives above the 48th parallel, might identify more strongly with Sarah Palin.  However, I am a liberal, and I believe that Obama is a liberal.  As such, I absolutely believe he will be fair.  I absolutely do not believe that Palin will be fair.  And I think that is part of her appeal to her base.

(P.S.   I was kidding about being hot.)

by ducky

November 05, 2008

Kaitlin Duck SherwoodAre we moving back to the US?

Several people have asked me, “So are you and Jim moving back to California now?”

The answer is “No, not yet.  Maybe never.”

I had six reasons to move to Canada:

  1. I was devastated that my fellow Americans could elect G.W. Bush for a second term.  That said to me that my fellow Americans and I were not at all on the same page, and that maybe I didn’t belong in the US.
  2. I was upset at how my government shredded civil liberties for both citizens (e.g., illegal wiretapping) and non-citizens (e.g., torture and abuse).
  3. I was unnerved by an almost willful neglect/disinterest in some major, fundamental structural problems in the US and Californian economies.  In particular, the US has, as Lloyd Bentsen famously put it in a 1988 VP debate, been “writing hot checks” for a very long time: spending a lot but not paying enough in taxes to support those costs.
  4. UBC was more nurturing than Stanford, my other choice for grad school.
  5. We have lots of relatives close to Vancouver, just across the border in Bellingham.
  6. Canada’s health system is not tied to employment.  It is highly likely that we will, at some point, be earning money but not be employed.  Living in Canada, that’s not a problem.  (Like right now.  I’m looking for work and Jim is consulting.)  Living in the US, that might be a problem.

The fact that my compatriots turned out in such droves for Obama lessens the feeling that I am out of step with the rest of America.  I was shocked and appalled by the divisive tactics used by the McCain/Palin campaign, but enormously heartened at the number of Republicans who have publicly voiced being likewise shocked and appalled.  So Obama’s election knocks off #1 pretty well.

I have finished my graduate degree, so #4 is off the list.

Our families are still in Bellingham.  We could move to Seattle and be slightly closer to our families, but California would be quite a bit farther away.  So #5 favours Vancouver or Seattle, but still disfavours California.

I think Obama will probably make #2 better.  Issuing an executive order banning torture at one minute past noon on Jan 20, 2009 would be a good start, but to see how he does on #2, I’ll have to see him govern.

Likewise, on #3,  I won’t know if he will make things better until I see him govern.  However, it’s not likely that he will be able to avoid “hot checks” in his term because of the horrible horrible financial problems.  He also can’t do much about California’s problems due to Prop 13.

There are more factors to consider now.

Ducky Watching Election Returns

  1. I like many things about Canada and Vancouver.
    • I have friends here.  (It was really nice to watch the election last night surrounded by a bunch of friends!)
    • It is really cool to live in the heart of downtown.  We are able to walk to everything (so much so that we only use our car about twice per month).
    • I like, in theory, that there is skiing so close.  We have season passes this year to a mountain that we can see from our apartment.  It takes about 30 minutes to drive there.
    • By and large, Canadian government services have far better customer service than in California.  It takes me about twenty minutes to renew my Social Insurance Number (like a Social Security Number in the US).  It took me about fifteen minutes to move my driver’s license to BC.
  2. It is not a perfect fit.
    • In particular, I still have ambitions to change the world, while I think Vancouver puts more value on having fun.  I’m trying to get the “fun” attitude, but it’s swimming upstream for me.  (Hopefully the ski passes this winter will help!)  Silicon Valley is all about changing the world, and so that is a huge magnet attracting me south.
    • I don’t like maple syrup, I have never played hockey, and I thought Anne of Green Gables was a boring book.  I did not spend many years steeped in the Canadian cultural stew, absorbing the Canadian value system, shared experiences, and etiquette.  I will never be fully Canadian. (At the same time, the longer I stay in Canada, the less time I spend in the American cultural stew; the less American I become.)
  3. Somewhat to my surprise, I discovered that I still love my country.
  4. I am growing to love Canada.
  5. I haven’t found a job yet.

So.  Will I ever return to the US?  To California?  I’m not sure.

by ducky


Kaitlin Duck SherwoodProp 8 looks like it will pass

It’s looking like California’s Prop 8 is going to pass, and that’s a very sad thing.

However, I think it was far, FAR more important that Obama get elected than that Prop 8 fail.  If McCain/Palin had won, we would have seen a significant shift in the Supreme Court to the right. We could have kissed goodbye to any hopes of getting marriage equality through the Supreme Court for twenty-five or thirty years.

With Obama in office, it will probably stay roughly the same in liberal/conservative makeup, but get younger.  I expect that we will now see a federal Supreme Court case in five to ten years about marriage equality.  And we will win that one — not just for California, but for everybody.

There is no good legal argument against marriage equality.  Let me repeat that: there is no good legal argument against marriage equality.  The arguments are emotional or religious, not rational.  The rational arguments, the ones on which our legal system is founded, say that citizens get equal protection under the law.  It’s in the Constitution.  It’s fundamental to the Constitution.  So unless the SCOTUS has people whose judgement is influenced by religion or emotion, we will win that fight.  (This will be especially true after five or ten more years of seeing same-sex marriages function in Massachusetts, New York, Connecticut, Canada, the Netherlands, Belgium, Spain, Israel, and South Africa without destroying the fabric of society.)

So yes, it is disappointing.  It would have been nice to put this issue to rest in California forever.  However, it is not dead.  We will overcome.

(Update:  Andrew Sullivan has a similar post, written with eloquence.)

by ducky



Vincent CheungCongratulations President (Elect) Obama!

After months and months of campaigning, it's finally over! And the US didn't screw things up this time around! I know the polls pretty much had called the election for Obama, but I was still a bit worried that something was gonna go wrong. I was impressed by the polling; they were pretty accurate. I was following FiveThirtyEight and RealClearPolitics quite closely and while each individual poll is not that reliable, they are pretty reliable if you average them all together and weight them appropriately (as 538 did).

Anyways, nothing I can say would be anything more than what has already been said by all the pundits and newscasters. Well, except for the fact that even here in Toronto, people are very interested in the election. After volleyball, I went for wings and beer at the Wheat Sheaf, a pub that seems to have a bit of a sports inclination. Even there, the TVs (maybe 6 - 8 of them) were on CNN for the election coverage and when John McCain was giving his concession speech, the place went quiet.

Congratulations President David Palmer.... err... I mean President Barack Obama ;)

by noreply@blogger.com (Vince)


November 04, 2008

Manas TungareA Heads-Up Display for Social Networks

I often find myself talking to people who I should know (in theory), but for some reason, in practice, my neurons refuse to make the right connections to remember these connections. Wouldn’t it be great if someone designed a heads-up display based on your social network?

This is how it would work: when I activate it, and it notices I’m talking to someone, it would do a quick scan and tell me his/her name. That would be a life-saver, and would avoid the first five minutes of the 20-Questions game I have to play every time this happens (while making sure that the other guy (or girl!) doesn’t notice I’m playing the game in my mind.)

It could also tell me how I know that person, because sometimes I remember the name, but nothing else. Wouldn’t it be helpful to know that I’m talking to John Doe, who went to the same high school as I did, and who is now President and CEO of a Fortune 100 company? (Note to self: graduate soon.)

Not just names, it could even tell me more about the person I didn’t already know (or, in the more likely case, I’ve forgotten.) I’d love to know that my friend John Doe is no longer with his (now ex-) girlfriend Jane, so that would cut out a lot of awkward conversation. Knowing that he just went on a cruise to Alaska would instantly give us a topic to chat about. Knowing that the lady on his arm is not his wife would probably also help. I could ask him about our common friends and if he were in touch with any of them. And then he could use his heads-up display to pull their details up and tell me what I’d already looked up, but that’s another story.

So why isn’t something like this on the market yet? I’m sure there would be throngs of people lined up outside the offices of the company that makes the first such thing. And if they try to patent it, you can cite my blog post as prior art. You’re welcome. :)

Update: A picture is worth a thousand words. A movie, perhaps a million?

by Manas

November 03, 2008

Kaitlin Duck SherwoodFor Barack Obama

Andrew Sullivan wrote an endorsement of Barack Obama that made me cry.  It wasn’t that his prose was so poetic that I got a form of Stendhal Syndrome.  It wasn’t that he inspired me.  It was that he reminded me, in clear and vivid detail, just how badly Bush messed up the country.  He brought up all of my grief and dismay about — and all of my shame for — my government’s actions.  He reminds me why I left my beautiful country.

One of my best friends is Lebanese.  In about 1996, I asked him why he never talked about Lebanese politics.  Had he just written it all off?  No, he said that it was too painful to talk about.  At the time, I didn’t understand.  Now I do.

My productivity in the past few weeks has gone way down as I continually hunt for more stories about the election.  It’s a destructive, addictive action.   It’s not like my reading the stories is going to change the outcome of the election.  (I voted several weeks ago, so it’s not like the stories are going to change my vote.)  I know that it is pointless to read about the election, but I can’t help it.  I must read, because every story that I read about Obama leading gives me a tiny flicker of hope.

I left.  I turned my back and walked away.  So why does it still matter?  The best analogy I can think of is of being in love with an absolutely wonderful man who two or three times per year beats the crap out of me.  I’ve metaphorically walked away and found another — one who is incredibly sweet and nice, but who isn’t as good a fit as my ex.  There is nothing wrong with my new beau, and I admit to a little bit of excitement at something novel.  But the fit isn’t quite right: he puts the toilet paper on backwards, he really likes foods I can’t stand, and he just doesn’t have the same shared context that I do with my ex.  I have to keep explaining things to him that my ex understood right away.  My new beau is certainly a fine and wonderful person, and I could be very content with him for the rest of my life, but there isn’t that same level of passion.  Really I want a reformed version of my ex, one who fits but who doesn’t beat me.

I want Obama to win.  Very much.  I then want him to get my beautiful country out of this mess.  (Er, these messes.)  I want that very, very much.  I’d like to think that someday I might have the option of coming home.

by ducky


Manas TungareEmail should have Expiration Dates

The entire idea behind this blog post has been summed up in the title, so all I need to do now is to explain why I think email should have expiration dates, and how that would make personal information management better.

Email, as we all know, started off as a way of sending short messages to colleagues within a department. It has since evolved into a monster of a tool that does everything it was never designed to do. The paradox is that it is exactly the kinds of messages that email was designed to handle that cause me the most trouble these days.

  1. I often receive email from my friends about meeting up for lunch. This is important, but only for that particular day (and that too, if I receive it before lunch time).
  2. My research collaborators send me email when a paper submission deadline is near, with the draft attached to it. Those emails are not nearly as important after the deadline.
  3. My friends and I exchange travel plans over email, but is it as useful after the trip is done?

These are the kinds of messages I’m talking about: important but time-sensitive. Then there are others which are not really important, but simply one-time notifications that I can take action on and then forget (“bill is due in 2 days”, “X added you as a friend”, “your order was received”, “your package has shipped”, “free donuts in break room”, “we are not meeting today”, etc.)

Why do they linger on in my mailbox for years? They become indistinguishable from the really important email that I need to save for years, such as some very interesting and intelligent discussions I have had with others. Note that I’m not including spam in this discussion, because in my opinion, there are adequate spam-filtering tools circa 2008 that perform well enough for most users for the most part with an acceptable false positive rate. Not perfect, but acceptable.

The Keeping Problem

Email is no longer ephemeral: people hold on to their email for years. This is what results in the Keeping Problem in Personal Information Management: there is so much information coming at us that we don’t want to spend the time to decide what to keep and what to trash, so we end up keeping all of it. We hope we never have to do spring cleaning, and instead rely on search to find what we want.

Filing is not the answer

Many people file and tag their email, but the question is, is the cost of doing so (time as well as attention) worth the payoff at the end? Consider the two alternatives: spending 10 minutes each day filing your email, versus spending an hour a month looking for that one email. Ten minutes a day adds up to about five hours a month, so when you are swimming in a sea of email that shows no signs of abating, the second alternative soon starts looking better.

Same needle, bigger haystack

The bigger the haystack grows, the harder it is to find the needle. The solution is to reduce the size of the haystack. Automatically. Most other solutions empower the user to filter, sort, file, tag and do other sorts of things to their email that do not scale very well. That’s where Email Expiration Dates come into play. For it to work, they need to be (1) defined and (2) honored.

Defining an Email Expiration Tag

Email expiration tags can be defined in several ways by several entities that handle the email message at some point of time in transit.

  1. By the sender of that email who cares about the recipients;
  2. By the email client (MUA) used by the sender, automatically inferring from certain common-sense words; e.g. subject contains lunch and body is less than 100 bytes;
  3. By the email server software that intelligently tags email based on common patterns seen across multiple users;
  4. By the recipient’s email client, based on heuristics;
  5. By the recipient’s email client, based on a user-defined rule set;
  6. Or explicitly by the recipient in a spring cleaning session.
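As a rough illustration of the heuristic route (items 2 and 4 above), here is a minimal sketch of the one rule the list mentions: a subject containing "lunch" and a body under 100 bytes. The function name and the choice of "end of the day received" as the expiration time are assumptions made for the example, not part of any standard.

```python
from datetime import datetime, time


def infer_expiration(subject, body, received):
    """Guess when a message stops being useful; None means keep indefinitely.

    Hypothetical rule: a short message with "lunch" in the subject expires
    at the end of the day it was received.
    """
    is_short = len(body.encode("utf-8")) < 100
    if "lunch" in subject.lower() and is_short:
        return datetime.combine(
            received.date(), time(23, 59), tzinfo=received.tzinfo
        )
    return None  # no rule matched; leave the message untagged


if __name__ == "__main__":
    received = datetime(2008, 11, 3, 9, 30)
    print(infer_expiration("Lunch today?", "Noon at the usual place?", received))
    print(infer_expiration("Paper draft", "See attached.", received))
```

A real client would need many such rules, and, since any heuristic will sometimes guess wrong, a way for the user to override them.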

Honoring an Email Expiration Tag: Fully standards-compliant

RFC 822 allows custom tags (Sec. 4.7.5). These are commonly referred to as X- headers, since the specification requires that all such tags be prefixed with “X-”. Many applications built on email make use of such tags: mailing lists use the X-List-* headers to specify the list name, subscribe URL and unsubscribe URL in a mail message. Spam filtering software such as SpamAssassin assigns a score to each email, saved as an X- header. Mail clients are free to interpret these tags as they see fit.

An expired email will not be automatically deleted if the user does not want it to be. This is important for archival purposes and to satisfy the stringent reporting requirements of the Sarbanes-Oxley Act. But now the user can make a one-button choice about whether or not expired emails be deleted, archived, moved away or kept around.
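To make the idea concrete, here is a minimal sketch using Python's standard `email` package. The header name `X-Expiration-Date` is hypothetical (the post does not propose a specific name), and the three policies mirror the user's choice described above: delete, archive, or leave expired mail alone.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical header name; the post does not propose a specific one.
EXPIRY_HEADER = "X-Expiration-Date"


def stamp(msg, expires):
    """Sender side: record the expiration time as a custom header."""
    msg[EXPIRY_HEADER] = format_datetime(expires)
    return msg


def action_for(msg, now, expired_policy="leave"):
    """Receiver side: "keep" for live mail, else the user's chosen policy.

    expired_policy is the user's choice for expired mail: "delete",
    "archive", or "leave". `now` must be a timezone-aware datetime.
    """
    raw = msg.get(EXPIRY_HEADER)
    if raw is None:
        return "keep"  # untagged mail is never touched
    try:
        expires = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        return "keep"  # unreadable tag: err on the side of keeping
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)  # assume UTC
    return expired_policy if expires <= now else "keep"


if __name__ == "__main__":
    msg = EmailMessage()
    msg["Subject"] = "Free donuts in break room"
    stamp(msg, datetime(2008, 11, 3, 17, 0, tzinfo=timezone.utc))
    later = datetime(2008, 11, 10, 9, 0, tzinfo=timezone.utc)
    print(action_for(msg, later, expired_policy="archive"))
```

Mail with a missing or unparseable tag is always kept, so a client that honors the tag never does worse than one that ignores it.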

With help from legitimate bulk email senders (not spammers)

Bulk mail such as Facebook notifications could have expiration dates set to “one week after receipt”. Bill reminders could set the expiration date to be “2 days past deadline” (and then send another notification if payment is not received by then.) Donut announcements could expire at the end of the day. Talk announcements could expire at the end of the talk.
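A bulk sender could derive these dates from a small rule table. The sketch below encodes the four examples from the paragraph above; the category names and the function are made up for illustration, and "one week after receipt" is approximated as one week after sending, since the sender cannot know when the message arrives.

```python
from datetime import datetime, timedelta


def expiration_for(kind, sent, deadline=None, event_end=None):
    """Pick an expiration time for a bulk message (hypothetical categories)."""
    if kind == "social_notification":
        return sent + timedelta(weeks=1)   # about a week after receipt
    if kind == "bill_reminder":
        return deadline + timedelta(days=2)  # 2 days past the deadline
    if kind == "donuts":
        # End of the day the announcement was sent.
        return sent.replace(hour=23, minute=59, second=0, microsecond=0)
    if kind == "talk":
        return event_end                   # stale once the talk is over
    raise ValueError(f"no expiration rule for {kind!r}")


if __name__ == "__main__":
    sent = datetime(2008, 11, 3, 10, 15)
    print(expiration_for("social_notification", sent))
    print(expiration_for("bill_reminder", sent, deadline=datetime(2008, 11, 15)))
    print(expiration_for("donuts", sent))
```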

Fixing the post-vacation blues

Returning from a vacation is no longer refreshing, as we are thinking about the sheer volume of email we need to process once we get home. If I was on vacation when the donuts were on the table, I should not be bothered about it when I return. Go away! If it’s an invitation to a talk that happened while I was away, I don’t need to hear about it now.

What will it take for adoption?

Defining a standard is no use if it isn’t used. The best way for such a solution to be adopted is for a major email provider to implement it themselves, perhaps in a limited beta. On the interface side, this requires two additions: one for sending, one for processing received messages. The widget at the sender’s end is simply a calendar picker, or a drop-down with relative dates (“tomorrow”, “next week”, etc.) At the receiving end, it’s a three-way radio button that lets users “Delete”, “Archive” or “Leave alone” expired messages.

Till then, it’s back to manual spring cleaning. Oh well.

Acknowledgments: I have had several stimulating discussions with my advisor, Manuel Pérez-Quiñones, and my colleague, Pardha Pyla, about our respective email filing strategies (discussions that mostly began as venting sessions). This idea no doubt borrows from my analysis and conclusions based on some of those conversations.

by Manas


Manas TungareThe Case for Decentralized Social Networks

This article was originally written October 3, 2007 and published here before OpenSocial was announced. With this blog post, I’m moving it to my blog to avail of features such as commenting and cross-linking with other related posts. I have not edited it since the original writing; if I do, edits will appear as updates marked as such.

Social networks are currently walled gardens: you need an account on multiple social networking sites to be able to interact with all your friends. This article makes a case for opening up the core protocols that define person-to-person interaction (decentralized networks) and various aspects of your public personality (decentralized applications). It is possible to use a few well-known semantic Web protocols and microformats to break down the walls and make the Internet a true social network.

Social networking Web sites are currently walled gardens. If you’re on MySpace, you cannot communicate with Facebook users or Orkut users. Although the features provided by most sites are comparable, if not equivalent, one must have an account on each of these sites to interact with members from that community.

The Motivation

That is not how social networks work in the real world. I do not need to be a citizen of a country or a follower of a religion to converse with the members of that country or religion, respectively. OK, this is a far-fetched analogy, but consider email.

The Evolution of Email

Before email as we know it today was in wide-spread use, the earliest way to send a message to anyone using a computer was simply to drop a file in their home directory. You could thus only send a message to users of the same computer as you. Ray Tomlinson came up with the idea of addressing users using the “@” sign, so email could be sent across computers. In the opinion of Jon Postel, this was a nice hack that finally evolved to an IETF RFC.

Today, we are able to address email to anyone on any network that’s connected to the Internet. Their ISP, or operating system, or mail transmission agent (MTA) or mail user agent (MUA; commonly referred to as a mail client) has no bearing on whether they will receive our email or not. The diversity in the email ecosystem allows me to receive, download and view my email in exactly the way I want.

Fast forward to Instant Messaging

Instant messaging evolved similarly, with ICQ, AOL, Yahoo!, and Microsoft all developing their functionally-equivalent, but non-interoperable protocols for essentially the same task. You had to have an account with each of those providers to be able to talk to their users. Along came XMPP and Jabber, followed by the development of an IETF RFC for instant messaging, which has now found support in commercial products such as Google Talk. XMPP does not require users to have accounts on multiple servers; if you have an account on one XMPP server, you can chat with any user on any other XMPP network (provided other prerequisites such as authorization are met.)

Why this makes sense for Social Networking

Social networking is no longer one of the fringe activities on the Web. There are several Web sites that purportedly do the same thing (and I’m too lazy to list them all.) The point is, social networking is now becoming a conduit rather than a destination. Much of our online time is spent on social activities, and the importance of individual users and their individual contributions taken together is increasing.

So what would it look like?

A decentralized social network would let users sign up at whichever Host site they prefer (just as you can sign up with any email provider today.) They would be able to participate and interact with users of any other such Host site, with no additional signing up to do. They would be able to create a profile that best reflects their motivation in signing up: a college student may sign up at a Host that allows him to display his classes and academic interests, while a professional may choose a Host site that emphasizes her skills and experience. Applications running on any Host site will have access to the Friends List of the user account they are running under, even across Host sites. A user’s profile may even be fragmented across multiple Hosts, with each Host hosting a particular aspect, or type of content for the user.
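The shape of such a network can be sketched with email-style addresses of the form `user@host`. Everything below is a toy model under stated assumptions: the `Host` and `Network` classes are hypothetical, and the in-memory `Network` registry stands in for whatever discovery mechanism (DNS, for instance) a real protocol would use. No actual social networking API is implied.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """One Host site: it stores profiles and friend lists for its own users."""
    name: str
    profiles: dict = field(default_factory=dict)  # username -> profile dict
    friends: dict = field(default_factory=dict)   # username -> {"user@host"}

    def sign_up(self, user, **profile):
        self.profiles[user] = profile
        self.friends[user] = set()
        return f"{user}@{self.name}"


class Network:
    """Toy registry standing in for cross-host discovery."""

    def __init__(self):
        self.hosts = {}

    def add_host(self, host):
        self.hosts[host.name] = host
        return host

    def _locate(self, address):
        user, _, host_name = address.partition("@")
        return user, self.hosts[host_name]

    def befriend(self, a, b):
        # Each side's own Host records the link; neither user needs an
        # account on the other's Host.
        for me, other in ((a, b), (b, a)):
            user, host = self._locate(me)
            host.friends[user].add(other)

    def friends_of(self, address):
        user, host = self._locate(address)
        return sorted(host.friends[user])

    def profile_of(self, address):
        user, host = self._locate(address)
        return host.profiles[user]


if __name__ == "__main__":
    net = Network()
    campus = net.add_host(Host("campus.example"))
    work = net.add_host(Host("work.example"))
    alice = campus.sign_up("alice", classes=["HCI"])
    bob = work.sign_up("bob", skills=["C++"])
    net.befriend(alice, bob)
    print(net.friends_of(alice), net.profile_of(bob))
```

The point of the model is that the friend list holds full addresses rather than local user IDs, which is what lets an application running on one Host follow a friendship onto another.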

Advantages for Users

  1. You don’t have to sign up at multiple sites.
  2. You can choose a Host site that best suits your personality. If you like a casual, “explosion-in-a-media-factory” look, sign up for MySpace. If you prefer a more professional look, go for Facebook. If you want to expose more professional data than personal data, LinkedIn is your Host. If you would like to express your affiliation with your company or non-profit, use their Host site as your primary home.
  3. If none of these suit you, just roll out your own Host site that hosts exactly one profile: yours. That won’t prevent you from being part of the larger social network.
  4. Since individual Hosts manage their users’ profiles, privacy can be controlled better. You will retain the choice to pick a service that best matches your privacy expectations.

Advantages for Developers

  1. When a new social networking site announces their own API and protocols, developers won’t have to scamper to port their existing app to it. They simply continue hosting it themselves, and the newcomer Host will simply talk the same language out-of-the-box.
  2. Developers will also have the flexibility of tailoring their interface whichever way they want — they will not be required to adhere to the strict interface guidelines of individual sites.
  3. Those who rely heavily on eyeballs and advertising can continue to host their own content, not subject to a third party’s terms of service.
  4. Expert users may choose to develop apps for themselves. These one-offs will be easy to integrate with that particular user’s profile.

Advantages for Hosts

  1. Host sites will be able to position themselves in a market better, and distinguish themselves from other offerings in a better way than existing sites can.
  2. The network effect will no longer be the dominant reason for users picking one social networking site over another, and such sites will have to compete on real features and good design, rather than simply “because all my friends are here”.
  3. Closer ties between users and hosts will enable them to tailor their services to the particular class of users they attract.

A Change in Philosophy

A few things will need to be re-thought, because, in a decentralized network, there are better alternatives to existing ways of doing things.

Disseminating and aggregating specialized content

We have seen specific websites and companies excelling in managing different types of data. Flickr specializes in photos, YouTube in videos, Blogger in blog posts, Twitter in one-line “twits”. All-purpose sharing websites such as Facebook also let you upload and embed all these types of content. What is the use of duplicating this content on each individual social networking site?

In a decentralized network, my blog could stay at Blogger, my photos on Flickr, and my videos on YouTube. My personal profile is simply an aggregation of these multiple aspects of my personality. What’s more, to design my own profile, I could just pick and choose the “modules” I want from a palette of available syndication options. (In fact, my own website is already designed like that: content you see here is aggregated from Twitter, Flickr, and FeedBurner, plus a few hosted pages.)

Developers can concentrate on what they do best, and outsource the rest of it to experts in individual areas. Photo album designers will not have to reinvent Flickr, and video distributors can simply leech YouTube’s bandwidth for their hosting.

Profile information can be mashed up

A user’s profile information can easily be mashed up for quick one-off applications. For example, if I need to create a list of all my friends from a particular group to print greeting cards, I do not need to write an application, submit it to Facebook and wait for their approval. I simply deploy it to my own Host’s server and get done in the time it takes to write “SELECT * FROM Friends WHERE Group = ‘christmas-cards’” (oh, and I would totally pick a host that provides a SQL interface for social data!) I can have an address book that integrates with my web-based email client, that maintains an updated list of email addresses of all my friends, of course pulled from their individual profiles.
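To make the greeting-card example concrete, here is a minimal sketch in Python of what such a one-off query might look like. Everything beyond the query in the text is an assumption for illustration: the Friends table layout, the name and email columns, the sample rows, and the use of an in-memory SQLite database standing in for a Host's SQL interface. Note that "Group" is a reserved word in SQL, so it has to be quoted when used as a column name.

```python
import sqlite3


def friends_in_group(conn, group):
    """Return (name, email) pairs for every friend in the given group."""
    # "Group" is a reserved word in SQL, so the column name is quoted.
    rows = conn.execute(
        'SELECT name, email FROM Friends WHERE "Group" = ? ORDER BY name',
        (group,),
    )
    return rows.fetchall()


# Hypothetical data: a real Host would already hold the friends list.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Friends (name TEXT, email TEXT, "Group" TEXT)')
conn.executemany(
    "INSERT INTO Friends VALUES (?, ?, ?)",
    [
        ("Alice", "alice@example.org", "christmas-cards"),
        ("Bob", "bob@example.org", "coworkers"),
        ("Carol", "carol@example.org", "christmas-cards"),
    ],
)

card_list = friends_in_group(conn, "christmas-cards")
print(card_list)
```

The group name is passed as a bound parameter rather than pasted into the query string, which matters as soon as the value comes from anywhere other than your own keyboard.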

Rethinking privacy and authorization

Authentication is easy (we’ll look at that soon). Authorization is hard. But this is a problem that should be easy for public-key cryptography to solve. I’m not a cryptography expert, so anything I say here will be wrong. But I trust that if the experts put their minds to this, it shouldn’t be too bad to solve without requiring Alice, Bob and other alphabet-soup-inspired characters to make all their keys public.

Enhanced Search

To some, this may sound like a gross invasion of privacy, but in fact, deciding what information should be public, and making that publicly-accessible information searchable, are two different problems. Privacy gate-keepers at each Host will decide what content to make publicly accessible. Once that decision has been made, all the major search engines can index the public information (without having access to any of the private stuff.) Google made the Web searchable. A search engine for The One Social Network will make the world’s population searchable.

The Ground Work

A quick analysis of what’s required to make this happen makes us realize that much of the groundwork has already been laid.

Representing People and Relationships

The chief contribution of the recent boom in social networking is the recognition of the Person as a first-class entity on the Web. Earlier, the only way to represent a person on the Web was via her home-page. But that, too, was a static representation, largely disconnected from the activities and evolution of that person.

A recent push towards including semantic markup in Web pages has led to the development of microformats, a light-weight method of marking up entities within Web content in terms of loosely defined formats that do not interfere with the already-existing presentation duties of HTML. There is the hCard microformat defined for representing a person. The XHTML Friends Network establishes a format for indicating relationships among individuals on the Web. A lot of users and Host sites have made their pages XFN-Friendly, i.e., they have added semantic markup to the lists of their friends to indicate relationships.
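As a rough illustration of how little machinery XFN needs, the following Python sketch pulls relationship data out of a friends list using only the standard library. XFN expresses relationships as values of the rel attribute on ordinary links; the sample markup, the example.org URLs, and the function and class names are made up for this sketch.

```python
from html.parser import HTMLParser

# Relationship values defined by the XFN 1.1 profile.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNParser(HTMLParser):
    """Collect (href, [relationships]) pairs from <a rel="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = attributes.get("rel") or ""
        # Keep only XFN values; rel may also carry things like "nofollow".
        relationships = [value for value in rel.split() if value in XFN_VALUES]
        if href and relationships:
            self.links.append((href, relationships))


def parse_xfn(html):
    parser = XFNParser()
    parser.feed(html)
    return parser.links


# Hypothetical friends list as it might appear on a profile page.
sample = """
<ul>
  <li><a href="http://alice.example.org/" rel="friend met">Alice</a></li>
  <li><a href="http://bob.example.org/" rel="co-worker nofollow">Bob</a></li>
  <li><a href="http://example.org/about">About this site</a></li>
</ul>
"""

relationships = parse_xfn(sample)
print(relationships)
```

Because the markup is just HTML, a crawler on any Host could run something like this against any other Host's public pages without a special API.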

Representing Activities

Blogs and twits have emerged as easy ways for people to broadcast their activities to whoever is ready to lend an interested ear. There already are standards that help people share these activity logs in standard formats: Atom and RSS.

Representing Personal Information

Again, microformats have been defined for such diverse things as user-posted reviews, calendar entries, résumés, addresses, geographical location information, with a whole lot of other discussions in progress. The mother of all social networking artifacts, tagging, has also been microformatized.

Communicating Across Diverse Websites

Many sites these days are opening up their APIs for external applications to access and modify users’ data over the Web. SOAP, XML-RPC and other, more formal protocols have given way to REST (Representational State Transfer) as a light-weight software architecture for distributed systems. With RESTful websites, it is easy for independent applications to modify data stored on servers: examples include Google’s GData APIs for many properties, Flickr’s API for accessing photos and metadata, Twitter’s API for posting twits, and many other services.

Distributed Authentication

Systems such as OpenID are emerging as viable standards for truly distributed authentication and identity management. There is no reason why an OpenID-based system cannot be used for the Network We Talked About. If we throw in the ability for Hosts to share authentication lists, that would make all Hosts available to all Users, and the question of having to “pick” a particular host may be moot.

What’s Missing?

Communication Protocols for Posting Messages Across Hosts

REST is here, but it only defines the transport architecture. A RESTful communication protocol will have to be developed for users to be able to post messages to other users on other Hosts. Nothing monumental, but just one thing that needs to be done.
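No such protocol exists yet, so the following Python sketch is purely hypothetical: it shows one shape a cross-Host message post could take, namely an HTTP POST of a small JSON document to an inbox resource on the recipient's Host. The /users/NAME/inbox URL layout, the field names, and the example hosts are all assumptions made up for illustration. The request is only constructed, never sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_message_request(host_url, recipient, sender, body):
    """Build (but do not send) a POST to the recipient's inbox resource."""
    # Hypothetical payload: the sender is identified by a profile URL.
    payload = {"from": sender, "to": recipient, "body": body}
    # Hypothetical URL layout: one inbox resource per user on each Host.
    url = f"{host_url.rstrip('/')}/users/{quote(recipient, safe='')}/inbox"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_message_request(
    "https://host-b.example/",
    "bob",
    "https://host-a.example/users/alice",
    "Happy holidays!",
)
print(request.get_method(), request.full_url)
```

A real protocol would also have to settle how the receiving Host verifies the sender, which is the authorization problem discussed earlier.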

Representing Groups Across Hosts

Groups of users will need a way to be recognized across Hosts. A simple way of doing this would be a naming scheme that stays unique across the network, much as Usenet group names have. A lot of the lessons learned from the design of Usenet can be used here, because today’s social networks are very similar to Usenet, with a few other goodies thrown in.
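One way to picture a Usenet-style scheme is as dotted, hierarchical names that every Host parses the same way. The Python sketch below assumes such a format; the allowed characters, the two-component minimum, and the function names are illustrative choices, not part of any existing standard.

```python
import re

# Assumed rule: lowercase letters, digits, "+", "_" and "-" per component.
_COMPONENT = re.compile(r"[a-z0-9][a-z0-9+_-]*")


def parse_group_name(name):
    """Split a dotted group name into components, normalizing case."""
    parts = name.lower().split(".")
    if len(parts) < 2 or not all(_COMPONENT.fullmatch(part) for part in parts):
        raise ValueError(f"not a valid group name: {name!r}")
    return tuple(parts)


def in_hierarchy(name, hierarchy):
    """Return True if the group sits at or below the given hierarchy."""
    parts = parse_group_name(name)
    prefix = tuple(hierarchy.lower().split("."))
    return parts[: len(prefix)] == prefix


print(parse_group_name("Rec.Photo.Vancouver"))
print(in_hierarchy("rec.photo.vancouver", "rec.photo"))
```

Comparing whole components rather than raw string prefixes keeps a group like rec.photography from being mistaken for a member of rec.photo.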

Current Efforts

Although we are far from this vision, some sites (mainly Facebook) seem to have started on this path. The Facebook platform was a unique step in allowing developers to access users’ profile information. Facebook is still a walled garden, though. In part to increase traffic, it has also taken baby steps toward making users’ profiles available to search engines. MySpace profile pages are still very un-crawl-able. Flickr, Upcoming, and other Yahoo! sites use microformats extensively. Facebook provides RSS feeds of user activity.

Although these are steps in the right direction, they are not enough. Hopefully, we will reach a critical mass of social networking sites that adopt an open social network policy. Till then, you can find me at my many online haunts.

by Manas

November 02, 2008

Kaitlin Duck Sherwood: Legacy of Proposition 13

I went looking today for my blog posting about Prop 13, and was stunned that I couldn’t find it.  I was sure I had written one, but I guess I hadn’t.

This is a story of unintended consequences of lowering taxes.  As a result of Prop 13, it costs cities more to provide services to residents than they can get from the residents in property taxes.  They only make money from business (in sales taxes).  Cities thus work really hard to avoid zoning for residential (especially high-density residential!) and do everything they can to attract businesses.  The result is that the cost of housing close to jobs has gone way, way up.

Meanwhile, there is a loophole that lets cities get some money from residents: developers’ fees.  If a big developer wants to put in a subdivision, cities are allowed to charge the developer fees for that.  (I think the fees can be arbitrarily high, but don’t know for sure.)  Some cities can use the fees that developers pay now for future housing to allow them to cover the costs of services to the current residents.  It’s a Ponzi scheme, however: you have to keep building in order for this to work.  Thus this only works for cities that have a lot of land, i.e., ones way out in the exurbs.

This means that the housing is waaay far away from the jobs.  Note that the long-haul roads are paid for with either state money or federal money, so neither the city with the jobs nor the city with the housing cares how much the road costs.  Massive sprawl has ensued.

This means that housing is expensive in general, and even higher in cities close to jobs.  Meanwhile, cities’ revenues increase at a little bit more than 1% per year, but inflation — which influences how much they have to pay — has gone up way faster than that.  In particular, the salaries they need to pay are influenced by the cost of housing, which has skyrocketed.

Meanwhile the state has no money.  It got hammered by the difference between what it took in and what it had to pay out, just like the cities.  Furthermore, in the 1980s, the federal government either took money from the states or stopped giving as much (I forget which) as part of the Reagan Revolution.  Thus the state just took money from the cities.  Just took it.  And because it was bigger, it could.  This contributed to cities’ difficulties.

Because of the sprawl, building public transportation is expensive.  Because the state has no money, it can’t afford to build public transportation.  This means that people have to spend a looooong time in their cars.  This is expensive in time, and more recently, in money.

Because the state has no money, the public education system has gone into the toilet.  California ranks 36th from the top in per-pupil public school expenditures (just below Missouri) and 35th from the top in SAT scores (just below Virginia).  There are only two states that spend less when you adjust for the cost of living.  Because the public school system is so bad, many people send their kids to private schools, which decreases their desire to pay taxes for the public school system.

Because of Prop 13, the citizens of California pay significantly extra for housing, schooling, and transportation for the privilege of spending more time in their cars.

How do you fix it?  Well, you can’t really raise taxes because the citizens are already getting squeezed, so they aren’t going to want to raise taxes.  Furthermore, the ones that already own houses have a strong disincentive to make housing in general more affordable.  They get rewarded if the price of their house goes up.  The only way I can see to fix it is to amend Prop 13 so that, over a long period of time, the limit on how much property taxes can rise is loosened, perhaps capped by inflation.  I left California in part because I am very pessimistic about the chances of this structural weakness in the California economy changing.

Where did all the money go?  It went to people like my husband: people who bought a house, held it for long enough for it to appreciate, and sold it for a huge profit, and then left the state.  It went to people who owned land and sold it to developers.

by ducky


Kaitlin Duck Sherwood: Interesting Vancouver

I got a ticket to Interesting Vancouver from Boris Mann, who uh reminded me that I owed him a recap in exchange.  That’s a perfectly reasonable request, but that message didn’t sink in ahead of time, so I didn’t take notes or try very hard to remember.  I’ll do a dump of my impressions, but you should note that I seem to have been grumpy that evening, perhaps because I didn’t have enough dinner.

  • James Sherret from AdHack:  I was about 10 minutes late, so missed his talk completely.
  • Darren Barefoot, laptopbedouin.com: I came in in the middle.  Darren was basically waxing rhapsodic about the value/joy of telecommuting from other countries for multiple months at a time.  His message seemed to be “go, it’ll be fun, you’ll learn, it won’t cost as much as you think, what do you have to lose?”
    • When I was younger, I wouldn’t have found anything at all wrong with that message. I would have had little patience for old farts telling me (or anyone) that I should grow up and start being responsible blah blah blah. However, now that I am older and have seen how health issues can trash a life, I would suggest more caution, particularly for people who are citizens of countries without socialized medicine. Part of “being responsible” is saving away the money that you will need for later. When you lose your job. When you can’t work because of your illness. When your partner can’t work. When your mother has a heart attack. When your kid needs rehab. You might be fine now, but someday you won’t be. Running off to live in a third-world country probably increases your risk of illness, complications, accidents, and/or violence at least slightly. It also cuts you off from your family, the same family that you might need to turn to someday.
  • Roy Yen, soomo.com: I think Roy was the one who was arguing that our “vertical architecture” (i.e. skyscrapers) was contributing to loneliness and isolation, and that we really needed a public gathering space in Vancouver.  He said that Vancouver used to have a big public gathering space, but there was a riot in the 70s and The Powers That Be decided that having a big public gathering space was a Bad Idea, so redeveloped it away.  He pointed out that the only public gathering space left is the back side of the Art Gallery, and that the Art Gallery is slated to move to False Creek.
    • When he blamed the high-density development for loneliness and isolation, I was kind of stunned.  “Has he ever lived in suburbia???” I remember asking myself.  I immediately thought of a neighbourhood in Orange County where a friend lived, where I spent a few days once.  It was all snout houses, and my friend said that in a year of living there, she had never spoken to a neighbour.  I think she maybe hadn’t even seen her immediate neighbours — they drove into their garage, and thence immediately into their house.  I am now living in a skyscraper for the first time, and I find the density wonderful: I see people in the elevators, I see people as I walk to the bus stop, I see people as I walk to the grocery store, etc. etc. etc.  It feels far more communal than driving to everywhere in a car by myself.
    • I did think it was interesting to hear about the bygone public space and to think about the back of the Vancouver Art Gallery being the one public gathering space.   However, most of the places that I’ve lived didn’t have public gathering spaces, and somehow we got by.
  • James Chutter, radarddb.com:
    • James’ presentation didn’t have a real obvious thesis statement — I don’t know that I learned anything from his talk, but I remember I enjoyed it.  He told the story of his evolution as a storyteller, and in doing so talked about the evolution of the Web.
  • Cheryl Stephens, plainlanguage.com: Cheryl is a lawyer and literacy advocate who talked about how widespread the problem of literacy is.
    • Cheryl lost my attention very, very quickly.  Some combination of her voice level, the microphone level, how far she stood from the mike, and me being in the back of the room (I was late, remember?) meant that I had to exert some effort to understand her words, and I didn’t like her words well enough to pay attention.  In particular, early on, she asserted, “There can be no education without literacy.”  While I  might have been extra grumpy that evening (note my grumpy comments elsewhere), I found that statement offensive.  Um, blind people can’t be educated?  (NB: Only 3% of the visually impaired students at UBC read Braille.  I presume the rest use screen readers.)
    • Later, she talked about how widespread illiteracy was, and said that only about 10% of the population could read at a college level.  I didn’t know about Canada, but right around half of the US population has attended some college.  Um, does that mean that 80% of people in college can’t read at a college level?
    • One thing that I did find interesting was her report that the Canadian Supreme Court ruled that explaining something wasn’t enough, that the plaintiffs also had to understand it.  (She gave the example of someone being offered counsel, and the perp saying he already had a drug counselor — not realizing that “counsel” meant “lawyer”.)
    • I was a little confused as to how explaining something verbally related to literacy.
    • At the end, she rushed in about thirty seconds of how to make your prose more understandable.  I personally would have preferred less talk aimed at convincing me literacy was a problem and more on how to address it.
  • Shannon LaBelle, Vancouver Museums: Shannon gave a very quick talk that was basically, “Vancouver has lots of interesting museums, especially the Museum of Anthropology when it reopens, go visit them!”
  • Irwin Oostindie, creativetechnology.org: Irwin talked about his community, the Downtown East Side, and about bringing pride to his community through culture, especially community-generated media.
    • I wanted to like Irwin’s lofty goals.  He was a very compelling speaker.  But I have done a lot of work in community media, and know that it is extremely difficult to make compelling media.  It sure seemed like he was getting his hopes up awfully high.  Well, best of luck to him.  Maybe.
    • He seemed to want to make DTES a vibrant, interesting, entertaining place.  I worry that if it becomes entertaining, it will quickly gentrify.  I think a lot of people in DTES don’t need entertainment, they need jobs.  They need housing.
  • Jeffrey Ellis, cloudscapecomics.com: Jeffrey gave a very quick talk advertising a group of comic artists who were about to release (just released?) another comic book.
    • Sure, fine.  Whatever.
  • Tom Williams, GiveMeaning: Tom told the story of how he used to be making big bucks in high finance, and thought he was happy until someone he had known before asked him a penetrating question.  I don’t remember the question, but it was something along the lines of “Are you really happy?” or “Does your life have meaning?” and that made him realize he wasn’t happy.  Tom quit his job and went looking for his purpose and couldn’t find it.  He came back to Vancouver, found that guy, and said something along the lines of “You ruined my life with your question, how do I fix it?”  The guy said, “Follow your passion.”  Tom said, “How do I find my passion????”  The guy said, “Follow your tears.”  From that, Tom started a micro-charity site.  (Think microlending, but microgiving instead.)  People can create projects (e.g. “sponsor me for the Breast Cancer Walk”) that others can then donate small amounts to.
    • I don’t mind giving people a little bit of money sometimes, but I do resent being on their mailing list forever and ever after.  When he explained his site, it made me think of all the trees that have died in the service of trying to extract more money from me.  :-(
    • His friend’s advice, “Follow your tears,” has hung around with me since.  I told my beloved husband that it probably meant that I had to go back to the US to try to fix the system.  Unfortunately, I find activism really boring. :-(
  • Naomi Devine, uvic.commonenergy.org: I don’t have a strong memory of this talk.  I think she was arguing for getting involved in local politics, especially green politics.  I suspect that the talk didn’t register because it was either trying to persuade me of something I already believe, or teaching me how to do something I already know.
  • James Glave, glave.com: I have very weak memories of this talk also.  I think again, he was trying to persuade me on something I’m already persuaded on.
  • Colin Keddie, Buckeye Bullet: Colin gave kind of a hit-and-run talk about the Buckeye Bullet, a very very fast experimental car developed at Ohio State University and which runs on fuel cells.
    • I would have liked to have heard more about how the car worked, the challenges that they faced, etc.  However, he only had three minutes, and that’s not a lot of time.
    • (Slightly off-topic: I got to see a talk on winning the DARPA challenge when I was at Google.  It’s a great talk, I highly recommend it.)
  • Joe Solomon, engagejoe.com: I don’t have any memories of this talk.  Maybe I was getting tired then?  Maybe I’m getting tired now?
  • Dave Ng, biotech.ubc.ca: Dave’s talk was on umm science illiteracy.
    • Dave gave a very engaging talk.  He put up three questions, and had us talk to our neighbours to help us decide if they were true or false.  All of them seemed to be designed to be so ridiculous that they couldn’t possibly be true.  I happened to have read Science News for enough years that I was very confident that the first two were true (which they were).  The third was something about how 46% of Americans believe they are experts in the evolutionary history of a particular type of bird; again, it looked like it couldn’t possibly be true.  It was a bit of a fakeout: it turned out that 46% of Americans thought that the Genesis story was literally true.
    • The audience participation was fun.
  • David Young, 2ndglobe.com: David talked about Great Place/times and wondered why Vancouver couldn’t do that.  He pointed out that Athens in Socrates’ time, Florence in Michelangelo’s time, Vienna in Beethoven’s time, the Revolutionary War-era US, and several other place/times had far fewer residents than Vancouver, so why can’t we do the same?
    • I have thought about this, and maybe I read something about this elsewhere, but I believe there are a few factors that account for most of why the great place/times were great:
      • Great wealth (which means lots of leisure time).  Frequently this wealth came by exploiting some other people.  The US Founding Fathers and Athenians had slaves, for example.
      • Lack of entertainment options.  We are less likely to do great things if The Simpsons is on.
      • Lack of historical competition.   Michelangelo showed up at a time when the Church was starting to be a bit freer in what it would tolerate in art.  (Michelangelo’s David was the second nude male sculpture in like 500 years…)
      • Technological advances.  Shakespeare wasn’t competing with hundreds of years of other playwrights, he was competing with around 100 years of post-printing-press playwrights.  The other playwrights and authors’ work didn’t get preserved.   The French impressionists were able to go outside and paint because they were able to purchase tubes of paint that they could take with them and which didn’t dry too fast.  (They also had competition from the camera for reproducing scenes absolutely faithfully, so needed to do something cameras couldn’t do.)

I also had an interesting time talking with Ray-last-name-unknown, who I met at some event a few months ago and who I’d spoken to at length at Third Tuesday just a few nights before.  We walked back to downtown together and didn’t have any dead spots in the conversation.

by ducky

November 01, 2008

David Anderson: Airports of the world, take notice

Singapore International Airport rocks.

The shopping and restaurant center in the international corridor is bigger than the main commercial zone of many so-called international airports.

For 5 euros, you can grab a delightful hot shower, sheer nirvana after 10 hours of flying. For 15 euros you can enjoy a day in the ambassador lounge, complete with complimentary refreshments, a complimentary bed to nap, complimentary gym, complimentary showers, and free internet access.

Oh, yeah, the internet access. Get this. Free broadband wifi internet access for the whole airport. Yes, you read that right: airport; wifi; broadband; free. All in the same sentence. Until now I thought airports were a "pick three of these four" deal, but it does appear that at least one airport in the world does get it.

I'm only here for six hours or so until my flight on to Zurich, but I will long remember Singapore International Airport as the first airport that was not only bearable to dwell in for 6 hours, but actually pleasant. And that's just the international corridor, I dare not imagine the awesomeness of the rest of the place. Whoever runs this joint, bravo.

by David Anderson

October 30, 2008

Kaitlin Duck Sherwood: Californians: please vote NO on Prop 8

Californians:

Please vote NO on Proposition 8 on Tuesday.  It is a bad law that would hurt people.  People I car