Tuesday, August 25, 2009

Blocking Procrastination

Dealing with e-procrastination issues or cyberslacking?

In the hopes of helping others, I'm going to admit that I deal with this problem sometimes... unfortunately when it's most important for me to focus. I found one helpful solution that may sound silly to those of you with rock-hard willpower, but for people like me, it's an important aid and reminder to get back to the task at hand. It's called "Simpleblock" and it's a Firefox extension that allows you to be your own "NetNanny". During thesis writing, I've used it to block Facebook, Google Reader, and Google News from my browser. Very helpful! You just have to resist turning it off.

Other focus-enhancing tools I've found include:
Good luck to everyone facing dissertation writing!

Sunday, August 23, 2009

Spam Contact Requests from Skype

Skype has been plagued over the years with the problem of spam 'contact requests'. It's a way for 'sexy girls' to get around your privacy settings and still contact you. While I have my privacy set to allow contacts only from people in my contact list, there is no way to block contact requests, and in reality I wouldn't want to block contact requests anyway since sometimes legitimate people do contact me.

So, several times a day I have to acknowledge, decline, and block requests like this:

The Skype forum indicates they are working on it, with 14+ pages of user complaints here and here. Many posters in this forum claim that it's a difficult problem and that users should just disable contact request notifications. I don't think it's a difficult issue, and I also don't want to miss out on legitimate contact requests.

What I don't understand is why this is such a problem for Skype (and Twitter), but not for other services. I don't receive spam friend requests on Facebook, MSN, or Google Talk. There seem to be simple ways to fix this. Off the top of my head, using social network metrics:

[1] detect new accounts that send contact requests to > N (say, 100) people
[2] if within those N people, the interlinkage ratio is low (e.g. fewer than 2% are connected to one another), flag the account as spam and cancel all those contact requests, block the IP.

This might work because if someone populates their initial contact list for Skype from something like their email database, it's likely that within that email database there are people who are also mutually connected. A spammer choosing random names or using an alphabetic list, on the other hand, will likely request contacts with people who for the most part do not know one another. This would be suspicious.
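
A rough Python sketch of the two-step check above, under some loud assumptions: `contacts` is a hypothetical mapping from each user to the set of people already on their contact list, and "interlinkage ratio" is read here as the fraction of recipients who are already connected to at least one other recipient. The thresholds (100 requests, 2%) are the ones suggested above; nothing here reflects how Skype actually works.

```python
def interlinkage_ratio(recipients, contacts):
    """Fraction of recipients already connected to at least one other recipient.

    `contacts` maps a user to the set of users on their contact list
    (a hypothetical data structure, used only for illustration).
    """
    recipients = set(recipients)
    if not recipients:
        return 0.0
    linked = sum(
        1
        for person in recipients
        if contacts.get(person, set()) & (recipients - {person})
    )
    return linked / len(recipients)


def looks_like_spam(recipients, contacts, min_requests=100, min_ratio=0.02):
    """Step [1]: many requests from one account. Step [2]: low interlinkage."""
    recipients = set(recipients)
    if len(recipients) <= min_requests:
        return False  # too few requests to judge
    return interlinkage_ratio(recipients, contacts) < min_ratio
```

An account that sends requests to 150 mutual strangers would be flagged, while one that sends requests to 150 people who mostly know each other would not.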

To augment this, what about some regular expression processing to detect suggestive words in the contact request?
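
As a minimal sketch of that regular expression idea (the word list is a made-up placeholder, and a real filter would need a much more careful vocabulary), something like this in Python could flag the text of a contact request:

```python
import re

# Hypothetical word list, for illustration only.
SUGGESTIVE_WORDS = ["sexy", "hot girls", "webcam"]

# Match any listed word or phrase as a whole word, ignoring case.
SUGGESTIVE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in SUGGESTIVE_WORDS) + r")\b",
    re.IGNORECASE,
)


def is_suggestive(message):
    """Return True if the contact request text contains a listed word."""
    return SUGGESTIVE_PATTERN.search(message) is not None
```

On its own this would be easy to evade with creative spelling, so it seems best treated as one extra signal alongside the network metrics rather than a filter by itself.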

Another option, which LinkedIn uses, is to require you to enter an email address of the person you are trying to contact, to confirm that you really know them. I realize Skype intends to facilitate introductions of like-minded people who don't otherwise know one another, but I have never used it for this purpose and I don't intend to. Some sort of captcha/secret question should be an option in the privacy settings so that only requests from people who know me make it through.

I'm not sure how MSN/Google etc. manage these sorts of spam contact/link requests, but they do a good job.

Wednesday, August 12, 2009

The Enter Key is Disabled (for our Super-Secret Reasons)

It's a sad day when the IEEE website makes it into my collection of 'infovis and usability pitfalls'. I've been saving a collection to use as icebreakers at the beginning of HCI classes I will teach. Today's instalment, courtesy of my own professional organization, is an arbitrary warning to 'please click the image' instead of pressing enter.

Issues with this of course include the fact that pressing [Enter] in search forms is the standard, assumed behaviour. Secondarily, there is the annoying popup box -- obviously the system knew I pressed [Enter], why give me an irritating message instead of just activating the darn search? Finally, this message ("Enter key is disabled, please click on the image to submit information") is infuriating as it offers no logical explanation as to why the key is disabled, making it seem like an arbitrary inconvenience. It also is not grammatical and does not end with any punctuation, making sticklers like me cringe. HCI 101: Write clear error messages.

These sorts of inconveniences seem minor on an individual basis, but I'd love to see the logs and know how many times a day this message is displayed. It's not even like this will be an easy-to-learn interaction technique, as it (a) is much slower than pressing [Enter] and (b) goes against the accepted conventions (my 'mental model' of how the web works).

It's shameful that an organization that hosts conferences on usability would create an interface like this. What's worse, I had to use Internet Explorer even to get this error message! In Firefox, my preferred browser, pressing [Enter] simply reloads the page with no apparent effect. No error, no search results. Just a reload. Hmmm, isn't the IEEE an advocate for web standards?

Wednesday, August 5, 2009

Library and Archives Canada Sells Out Canadian Students to US Publisher

I've just realized that Library and Archives Canada, the provider of the repository of Canadian theses, has outsourced the work of scanning and publishing theses to the US academic publisher ProQuest. It seems I'm way behind the times as this has been happening for years. I have some problems with this maddening situation:

  • ProQuest sells the theses and keeps the royalties (this was agreed to by the CFS in 2002!)
  • There is a 'minimum' 6 month delay for a thesis to appear online, but it will probably be more like 4 years
  • I have to pay to have ProQuest microfiche my thesis (Microfiche? What the hell is that?)
  • "Space is limited in ProQuest's database. Therefore, when writing your abstract, make sure that you don't exceed 150 words for masters theses and 350 words for doctoral dissertations" -- what century are we in? Arbitrary restrictions like 350 words for my doctoral dissertation abstract because ProQuest has a faulty database?
  • I can purchase my thesis from ProQuest at a discount later (thanks!)
This system seems very antiquated. My dissertation will be in full colour with lots of images -- why would I want anyone to receive a reprint from a scanned hard copy? While I know that microfiche is arguably a more stable archival format than electronic files such as PDF, there is a lot of work right now in digital preservation. I think we are far enough along to assume that a PDF can be an archival document. My thesis will be available as a PDF on my personal website for free in any case.

I investigated alternatives -- answer: there are none. Apparently Library and Archives Canada can accept electronic versions directly, skipping the ProQuest step. Luckily, U of T is switching to electronic dissertations as of August 31, 2009. However, they just told me on the phone that ProQuest is still involved even with electronic theses. This is confusing, as the LAC site seems to suggest otherwise.

I asked if I could opt out of that, and they said no, but that I should feel some comfort because (a) no one will actually buy my thesis from ProQuest so in reality the effect is negligible and (b) they agree with me and tried to get out of the ProQuest arrangement early but could not. They expect things to change in the next couple of years, including the demise of microfiche.

Well, on principle, that's not much comfort. But, as I told the helpful administrator at SGS, at this point actually graduating trumps my principles regarding personal intellectual property rights.

If only I could write dissertation pages as fast as blog posts.

Sunday, June 21, 2009

Academic Spam

I received this message today, from what seems to be a somewhat sophisticated spammer. I receive academic spam regularly, usually from conferences in Florida that accept computer-written papers. I usually just delete them but this one was interesting. They managed to grab my name, my thesis title, and my institution and fill them into what is certainly a form letter. Note the extra spaces before the commas and the two periods after my thesis title (and the odd use of two closing statements).
"Dear Christopher Collins ,

I am writing on behalf of the International academic publisher, LAP Lambert Academic Publishing AG & Co. KG.

In the course of a research at the Library ofUniversity of Toronto , We came across a reference to your thesis on Head-driven probabilistic parsing for word lattices. .

As we would like to make your work available to a larger audience, I am wondering if you may be interested in publishing your thesis in the form of a printed book.

Your reply including an e-mail address to which I can send an e-mail with further information in an attachment will be greatly appreciated.

I am looking forward to hearing from you.

Sincerely yours,
Kind Regards,

Toolasee Marooodamoothoo
Acquisition Editor"
A quick internet search for "LAP Lambert Academic Publishing" leads to some interesting blog hits of other people challenging this shady practice. Apparently if you follow through with their invitation, the first thing they ask for is your banking information, ostensibly to deposit funds from all those sales. To be fair, from their Amazon.com listings (which I won't link to, they don't deserve the hits), and from many experience reports on blogs, they do actually turn your PDF into a 'book' and send you 5 copies. So, if you want five free copies of your thesis, maybe this is a good scheme.

However, since my thesis is available freely on my website [pdf] in full format, I can't imagine why I would want to sell it on Amazon for a hugely inflated price. It won't be a bestseller. I won't make money. It would just be a rip-off for a couple (at most) of people. I guess now that we have print-on-demand publishing, which is what LAP Lambert uses, anything can be a book. It ushers in a new age of vanity publishing. I can't say I have much respect for people who feel their research gains legitimacy because they put a hard cover on it and charge US$100+. In case my website ever goes offline, my thesis is also freely available forever at the Library of Canada. So, LAP Lambert, no I don't need your services to make my thesis more widely available.

"LAP Lambert Academic Publishing" is the English language division of VDM Verlag. At first I thought 'oh, that must be related to Springer-Verlag', a well-respected academic publishing house. It turns out that 'verlag' means 'publishing house' and they are not affiliated.

[Edit 2015 - an earlier version of this post called out the Acquisition Editor's name as being funny.  This was insensitive and offensive, and I apologize.  Please everyone stop talking about the names of the editors - it is irrelevant to the issue.]

Friday, June 19, 2009

EuroVis 2009 Recap

Over on Infosthetics.com, Petra and I were recently invited to be guest bloggers. I was a bit awestruck to write for Information Aesthetics, I admit.

We have contributed our summary of the EuroVis 2009 conference... check it out!

Thursday, June 18, 2009

Auto-Sliding Scales Look Cool, but Destroy Interpretability

The new Thomson Reuters iPhone app from the folks at Reuters Labs is really great. It has easy and fast access to a huge collection of Reuters information from around the world. Also, you can read news offline, which is helpful for iPod touch owners like me. But, there is one irksome design issue, at least for visualization fans like me.

It has a charting tool to review the history of financial indices. At first glance, it seems really slick. InfoVis for the iPhone! Yay! However, when you actually play with it, you realize that as you scan time, the y-axis is constantly changing. This does not allow for true visual comparison between different time periods. The scale should be set to fit the data for all the available time periods; then panning would not be so disconcerting. An option to zoom in or locally optimize the scale could be provided.

Check out the elastic y-axis on this video. I've noted three different y-values that all appeared at roughly the same height as I scanned the data. Also, the bottom of the y-axis is set to maximize the visual variance in the time period. This means that in the example in the video, the y-axis origin is at about 8,000. This helps a viewer to understand the local variance, but exaggerates the overall impression of variance. Whether this is a good idea depends on the task. I guess most financial analysts are interested in recent history, but it would be nice to be able to zoom out to a zero-origin as a sanity check on relative fluctuations.
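
To make the difference concrete, here is a small Python sketch (the function names and sample numbers are invented for illustration and have nothing to do with the app's actual code). The "elastic" version refits the y-axis to whatever window is on screen, while the fixed version computes one range for the whole series, optionally with a zero origin:

```python
def elastic_limits(values, start, width):
    """Y-axis limits refit to the visible window only (the behaviour criticized above)."""
    window = values[start:start + width]
    return min(window), max(window)


def fixed_limits(values, zero_origin=False):
    """One set of y-axis limits for the whole series, so panning never rescales."""
    low = 0 if zero_origin else min(values)
    return low, max(values)


# Made-up index values, purely for illustration.
index_history = [8000, 8200, 9000, 12000, 11000, 8500]
first_window = elastic_limits(index_history, 0, 3)
second_window = elastic_limits(index_history, 3, 3)
whole_series = fixed_limits(index_history)
sanity_check = fixed_limits(index_history, zero_origin=True)
```

With the elastic limits, the same screen height stands for different values in each window, which is exactly why heights cannot be compared while panning.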

Thursday, June 4, 2009

A Response to "Sensemaking ok, but ACTION is what they need (Visuale)"

Enrico Bertini writes the interesting Visuale blog, and recently posted a piece arguing that our research quest for 'Sensemaking' misses the forest for the trees: in the creation and study of analysis processes, we are not actually supporting realistic scenarios where decision support is needed in a timely manner. Specifically, he says "visualization is useless if it doesn't help people take actions". While I don't necessarily agree that all our InfoVis research is barking up the wrong tree, I see his point. Some projects, such as my own Uncertainty Lattices, are specifically designed to help people make fast decisions about data. However, it is true that in the InfoVis, and especially in the sensemaking communities, we seem to focus on process before results.

I see his point in that many of the solutions we develop as researchers are decoupled from actual use. I think Shneiderman & Plaisant addressed this somewhat in their paper on MILCS (longitudinal case studies). The problem is indeed structural: we cannot prove real usefulness without long term deployments, and the incentive for such deployments is low in academia (and, these sorts of experiments are time consuming). We cannot become toolbuilders for business without careful (and publishable) follow up evaluations. So, what is the solution?

I think we could be doing great InfoVis research but also having an impact in the analytics world, especially business analytics. We need to partner more with those real world users of data... I would be elated to see some of the great ideas I see every year at InfoVis and other venues actually become real products. There is a gaping hole between the great research we do and the market.

However, I'm not sure that adding the constraints Enrico mentioned will necessarily lead to a situation of improved design, no matter how much design is improved by explicit constraints. Even a cursory look at the bulk of currently commercially available business analytics tools shows that they would never be acceptable to the 'academic' audience (due to poor information design, layout, and breaking well known constraints about human perception). On top of that, they are almost all ugly.

I recently saw a deployed visual analytic tool using dark blue text on a purple background. It was illegible. But it was deployed and paid for. And, it was working for the customer. I would argue that deployment success and ability to provide insight over exploration is not an indicator of quality design. This is the age old question of the mystery of product adoption by the market. Perhaps it is a factor of providing that immediacy Bertini mentions: the decision support in a short time; the answer rather than a lengthy exploration process. The hated fuel gauges might do that better than my own VisLinks. Great, if we are going for speed and quality of decisions and not depth of insight or potential for discovery. We need to separate the two, as they can't be supported the same way. Sensemaking is not about providing a single answer. That's artificial intelligence, or maybe even 'smart graphics'.

I agree completely on Data Mining vs. Visualization... I would sum it up to say the 'vs.' needs to become '&'. I think the strength for the future lies in closer ties between the two. We have 'data manipulations' as a step in every version of the InfoVis pipeline and in all visual analytics process diagrams, but too often the visualization is actually of some surface data, or the outputs of data mining. I think a closer coupling of the two, bringing vis as a 'box opening' tool for data mining, will be important. My own thesis research has been looking at just this for statistical linguistic processes such as translation and information retrieval, and I hope to do more of it in the future.

Tuesday, June 2, 2009

iTunes Annoyances

There are many things about the iTunes interface that are irritating, but by far the most annoying is the fact that it does not maintain a 'live' library -- changes to your music outside iTunes are virtually impossible to propagate back to the library. One solution is to delete and re-scan completely, but then you lose playlists, play counts, etc. Another solution used to be to use the great program iTunes Library Updater, but it does not work with newer versions of iTunes.

I just followed a complicated process outlined on Paul Mayne's blog but it only worked for files that do not have a duplicate. If you delete one copy of a duplicated file, then the Smart Playlist method doesn't see the file as missing, because the duplicate is still there. So, you can't clear up situations like this, where an album was accidentally duplicated, then removed:

To make matters worse, you can sort the iTunes table by any column except the indicator column containing the exclamation point. So, it seems the only solution (at least for Windows users) is to ctrl-click every second song in lists like this. That would be a long and tedious process, prone to accidentally deleting the wrong lines. I guess I'll have to do the total delete-rebuild operation.

There are several possible easy fixes to this:
  1. Live monitoring of music folders, as in Windows Media Player
  2. A "remove missing tracks" button
  3. Allow sort on missing status to put all (!) files in a contiguous list
With mature software like iTunes, I don't understand why this feature has not been created. A simple web search yields many complex workarounds -- obviously it's not just me wanting to do this.
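
For what it's worth, the core of a "remove missing tracks" button is not complicated. A minimal Python sketch, assuming the library can be read as a simple mapping from track ID to file path (a hypothetical structure, not the real iTunes library format):

```python
from pathlib import Path


def split_missing(tracks):
    """Split a {track_id: file_path} mapping into (present, missing) dicts.

    The mapping is a stand-in for a music library; it is an assumption
    made for this sketch, not how iTunes stores its data.
    """
    present, missing = {}, {}
    for track_id, location in tracks.items():
        target = present if Path(location).exists() else missing
        target[track_id] = location
    return present, missing
```

Everything in `missing` is what the (!) indicator marks, so it could either be removed in one step or listed contiguously for review, which would cover fixes 2 and 3 above.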

Thursday, April 9, 2009

Note to HP: Canada does not use A4 paper

While I think the ISO 216 system of paper sizing (A4, B1, etc.) is much more sensible than the arbitrary "letter" and "legal" sizes, HP should know that Canada doesn't use it! We use "letter". I'm using the HP 'universal printer driver' at work, and its helpful features raise an interesting question about interface design.

The driver documentation says that the default paper size is selected based on the locale setting in the OS. Mine is set to "Canada", so my default paper is A4. If this worked, it could be a helpful feature to get people printing quickly. The poor design decision here is to not let anyone override this default. To create a profile with default paper size as "letter", I have to create an entirely new printing profile. That wouldn't be such a big deal, except you can't set a new printing profile as the default. So, the default 'General Everyday Printing' with A4 is what I'm stuck with, meaning an extra click to change the paper size every time I have to print something. I guess it's time to switch to the older, non-universal, driver.

Or, I guess I could pretend to be in the USA, but then I won't be able to read dates and my spell checker would not like how I spell colour.
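
The behaviour I would expect is easy to sketch in Python. This is purely illustrative: the function, the country list, and the override parameter are all made up, and none of it is taken from HP's driver.

```python
# Countries assumed to default to letter-size paper (a deliberately short,
# illustrative list, not a complete or authoritative one).
LETTER_COUNTRIES = {"US", "CA"}


def default_paper_size(country_code, user_override=None):
    """Pick a default paper size from the locale, unless the user has chosen one."""
    if user_override:
        return user_override  # the user's choice always wins
    return "Letter" if country_code.upper() in LETTER_COUNTRIES else "A4"
```

The locale only supplies a starting guess; the moment the user states a preference, that preference becomes the default.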

Thursday, March 19, 2009


Thanks Adobe Reader! Everything is ok, because I have no choice apparently.

Monday, February 16, 2009

Irony: TSA an "unknown entity"

The security certificate for the contact forms on the TSA website in the US is invalid. Firefox throws up warnings about an unsigned and untrusted certificate. Somehow I find this really funny.

Monday, February 9, 2009

Nutritional Facts Visualizations

I recently saw these new nutritional labels at McDonald's: the first is a McDonald's-only infographic and the second is the traditional nutritional facts label. The infographic is in more prominent positions on the packaging. I'm not sure if this is a legislated change, or voluntary, but whatever it is, bravo!

The design is interesting. I think McDonald's is attempting to make the nutritional value of each measured component clear by providing icons for different nutritional components (building blocks for protein, etc.) and bar charts to compare amounts. However, I can't tell what the marker within the bar graphs represents. The units differ across items, so I think perhaps they are normalized to recommended daily intake (full bar = 100%). But, what value does the vertical broken line represent? It can't be 50%, since the 45% Fat bar is well beyond the marker. Perhaps I'm missing something obvious here, but without an explanation, this chart doesn't provide as much information as it could. Ideas?