
Wiki update – lost content and l10n

This week I answered a question about content lost on the new wiki after the migration. The contributor was concerned that content had been lost at random across the wiki without anyone knowing it was gone. His suggestion was programmatic: fix the migration script wherever it dropped content, then do some kind of re-import and merge. While I would love for someone to solve the current situation that way, I don’t see it happening unless somebody new picks up the work from Mike McGrath’s foundation.

The situation is this: some locations in the wiki have content that got squashed when it was injected into MediaWiki. One case we found out about afterward is that many comment blocks inside preformat blocks (<pre/>) were lost. There are likely to be other cases like that. This is the reason we maintain the old wiki location: to give people working on pages the power to fix content problems directly.
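For anyone comparing a page against its copy on the old wiki, the check for dropped comments can be sketched in a few lines of Python. This is only an illustration, not part of the migration script: it assumes you have already copied out the text of one preformat block from each wiki, and that the comments in question start with `#`. The function names are made up for this example.

```python
def comment_lines(block: str, marker: str = "#") -> list[str]:
    """Return the stripped lines of a preformat block that start with the marker."""
    return [
        line.strip()
        for line in block.splitlines()
        if line.strip().startswith(marker)
    ]


def lost_comments(old_block: str, new_block: str, marker: str = "#") -> list[str]:
    """Return comment lines that are in the old block but missing from the new one."""
    surviving = set(comment_lines(new_block, marker))
    return [line for line in comment_lines(old_block, marker) if line not in surviving]


if __name__ == "__main__":
    # Hypothetical block contents, pasted by hand from the old and new pages.
    old = "# enable the repository\nyum install foo\n# restart the service\n"
    new = "yum install foo\n"
    for line in lost_comments(old, new):
        print("missing:", line)
```

A check like this only tells you what is gone; restoring it is still a manual edit on the new page.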

Theoretically, we could have had 100% content fidelity with the migration. This is the old 80/20 problem of, “When is it good enough?” That is, when have you achieved the 80% goal and need to move on? Given the people who participated in the pre-migration, which included as many subProject and SIG leaders as could be found, we hoped to reach a good-enough result that was close to an accurate 80% rather than a best guess. Every problem raised by this group of contributors was fixed in the migration script or otherwise planned for, so there is not a whole lot more we could have done short of giving all contributors a go and dragging the migration deeper into the Fedora 10 cycle. It should be noted that Mike did a stellar job; he found no existing migration script that was worth the effort, so he had to work one up from scratch. It was a Herculean task and, not being demigods, we likely left some manure in the stable.

Ultimately, if we have pages that lost content and no one discovers or complains, I’m OK with that. It’s a forgotten area of the garden, and if no one cares enough about it to find and restore the content, then it is better gone than distracting the wiki gardeners. We’ll find it eventually and update or remove it.

A wiki has many, many advantages, but at its core it is a manual tool that depends on human processes. For example, to mark a page as deprecated, a MediaWiki-style solution is to create a Template:Deprecated whose content is injected into the page wherever you put the {{deprecated}} tag. But you still have to read the page, edit it by hand, and paste that tag in there. If you added buttons to the interface for all this … well, you’d still require people to read through, and you’d have a content management system that would effectively remove the open ease of a wiki that makes it popular, useful, … and dangerous.
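To show how small the mechanical half of that step is, here is a rough Python sketch that adds the tag to a page's wikitext if it is not already there. It assumes the template is named Deprecated and that the tag belongs at the top of the page; both the function names and that placement are choices made for this example, and fetching or saving the page is left out entirely.

```python
import re

DEPRECATED_TAG = "{{deprecated}}"

# Matches {{deprecated}} or {{deprecated|...}}, ignoring case and inner spacing.
_TAG_PATTERN = re.compile(r"\{\{\s*deprecated\s*(\||\}\})", re.IGNORECASE)


def is_marked_deprecated(wikitext: str) -> bool:
    """Return True if the page text already carries the deprecated tag."""
    return _TAG_PATTERN.search(wikitext) is not None


def mark_deprecated(wikitext: str) -> str:
    """Return the page text with the tag on its first line, added only once."""
    if is_marked_deprecated(wikitext):
        return wikitext
    return f"{DEPRECATED_TAG}\n{wikitext}"
```

None of this decides whether a page is actually out of date. That judgment is the part that needs a person reading the page.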

On another note, Nigel Jones is working on localization (l10n) for the wiki. In the Infrastructure team tracking ticket you can see how we’re adapting to work with MediaWiki. When he is ready for further review, expect an announcement on the translation announcement list and the Fedora Planet.

My hope is that we can use Transifex and the Fedora L10n interface to translate the wiki. To do that, we’re going to need someone to do the work of getting editable strings out of MediaWiki, into PO/POT files, and then back into MediaWiki once they are translated.
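As a sketch of the middle step only, the Python below turns a set of already-extracted strings into POT text. How the strings get out of MediaWiki and back in is the real work and is not shown. The location keys such as `Main_Page:1` are a made-up convention for this example, and the header is the bare minimum rather than what a real tool would write.

```python
def po_escape(text: str) -> str:
    """Escape backslashes, double quotes, and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_pot(strings: dict[str, str]) -> str:
    """Build POT text from a mapping of location key to source string."""
    # Minimal header entry: an empty msgid whose msgstr declares the charset.
    entries = ['msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n']
    for location, text in strings.items():
        entries.append(f'#: {location}\nmsgid "{po_escape(text)}"\nmsgstr ""\n')
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical strings, keyed by page title and paragraph number.
    sample = {
        "Main_Page:1": "Welcome to the wiki.",
        "Main_Page:2": 'Click "Edit" to change this page.',
    }
    print(to_pot(sample))
```

Keying each entry by page and position is one way to know where a translated string should go on the way back in, though it would break as soon as the source page is edited.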