Wikipedia Loves Art: Lessons Learned Part 3: Almost done

Erin is going to blog tomorrow about her own take on the process and some additional statistics, but here are just a few of the cleanup issues we’ve been dealing with on a pool of 13,000 images.

Machine Tagging, Captioning, Bonus Points (the boring, tedious stuff)

Erin has cleaned the entire pool and scored every entry. In some cases, this meant 3 or more machine tags per clean photo. I’m sure she’ll give you the total tomorrow, but the most basic math would indicate something along the lines of 30,000+ machine tags. Please keep in mind, that’s 30,000 tags applied by hand to organize a pool of 13,000 images. To say that our plans here didn’t scale is putting it a bit mildly. Erin, we all seriously owe you more than one drink.

Institutions are captioning at a pretty solid rate, but this will take some time. We are all trying to do this in spare hours and had hoped captioning would be done by the end of March, but it will take longer. You can query overall progress here (as of now, roughly 1700 of the 6000 clean shots have been captioned) and you can run queries by institution here. Because of the sheer volume of images shot at this one venue (2,690), the MET will be captioning on demand as the wiki community decides what it needs from the pool, and we are discussing the best way to coordinate that effort.

Cary Bass has approximately 6000 clean shots to go through to assign bonus points. To be fair to all the photographers, he’s being good about stopping when his brain is on overload. We expect the entire process to take him 46 hours, over many sessions. He’s now sorting entries by museum, so hopefully we can announce winners at each institution as he finishes groups rather than waiting for the entire pool. You can chart his progress and see his picks by running this query.

Uploading (more possible snafus)

We are currently facing issues surrounding how the images are going to get uploaded to Wikipedia (not something Erin or I have to do….yipeee). When we originally set out, institutions and photographers were told the images were going to be used to illustrate Wikipedia articles. The wiki community would like to upload them into Wikimedia Commons, which helps them manage assets and makes the images more accessible across the wiki platform, so they can be cross-posted at Wikipedia. To the wiki peeps, this is six/one-half-dozen/or-the-other and, in reality, this really is splitting hairs, but I wish it was something I had understood better at the start, so we could have more clearly defined it for the participants. We will be e-mailing participants soon to clarify this issue. At that point they will have the opportunity to leave the project without it affecting their scores or prizes.

Closing Thoughts

So, how did this project go off the rails? For starters, we jumped in with a large project instead of a much smaller one where we could apply the “keep it simple” rule. More importantly, the entire process was really designed to work with the Wiki community (people we didn’t know yet) to create a project that would engage our existing Flickr community (people we knew very well). What we found is that the community that we had a lot of experience with was the one that made this a smooth process, but the one we knew less about got us into rougher waters. When I look back on this project, what rings true in my mind is that all communities are different and when we are designing a project, it’s best to concentrate on perhaps one of those and start small, so you can chart the waters first. The issues you find might stop you in your tracks or they might help you design something more appropriate.
