
Bringing documentation to light

To be honest, before becoming an Outreachy intern at the Wikimedia Foundation, I had never thought about many of the technical aspects of Wikimedia projects. Obviously the work isn’t completed with miracles and magic, but the full complexity and importance of all the work done behind the scenes did not occur to me until I got involved with one of the most important aspects of a free software project: documentation.

!["Saint Jerome", by Caravaggio. The painting depicts Saint Jerome, a Doctor of the Church in Roman Catholicism and a popular subject for painting, even for Caravaggio, who produced other paintings of Jerome in Meditation and engaged in writing. In this image, Jerome is reading intently, an outstretched arm resting with quill. It has been suggested that Jerome is depicted in the act of translating the Vulgate.](/CaravaggioSanGerolamo.jpg) Painting of Saint Jerome, patron saint of translators, by Caravaggio. (Public Domain/[Wikimedia Commons](https://commons.wikimedia.org/wiki/File:Caravaggio_-_San_Gerolamo.jpg))

My role is dedicated to finding strategies to increase the number of people translating user guides. But before exploring possible ways to find new contributors, I needed to answer four questions:

  1. What do we define as a user guide?
  2. Is documentation well written?
  3. Are we capable of welcoming new translators?
  4. What is the current state of user guide translations?

While the answer to the first question might seem obvious to those extremely familiar with how wikis work, it was a source of confusion to me. As I searched for more information on subjects I was struggling with as a translator, I got lost very easily. I eventually ended up with multiple tabs of multiple wikis open, with little idea as to which one I ought to be relying on. But as I learned the conventions behind the organization of wikis, it became clear that what I was looking for was the pages under the Help namespace [1].

As for the state of documentation, the first thing I did when studying MediaWiki’s documentation was to look for its style guide. There are several ways to convey a message, and that’s why style guides are an essential tool when writing documentation: they provide guidelines which enforce consistency, set standards to be followed, and point to quality references to be sought. They are the ultimate expression of how the project communicates with people, and are therefore an important part of the brand identity. Consequently, the absence or incompleteness of a project’s style guide will directly influence readers’ perception of it.

MediaWiki’s style guide is far from perfect, especially as it relies too much on external references without highlighting which practices it considers the best. Unfortunately, this problem is not confined to the style guide, as it shows up in other documentation such as the Translation best practices. Writers end up without good and reliable resources to do their work, which makes it difficult to establish a target audience and a proper style of writing. And users, especially new users, may struggle to understand new concepts and processes.

As a person new to the Wikimedia movement, I experienced first hand what it is like to be an extremely confused and overwhelmed newcomer as I translated pages like CirrusSearch. It took me days to get used to the Translate extension workflow and weeks to understand the most basic concepts behind it. And as I learned more, I realized that my path to contributing technical translations was extremely erratic and far from ideal.

The process of becoming a translator needs to be easy to follow and to understand. Tools and resources have to be presented briefly but effectively so newcomers know where to find answers to their questions. I believe Meta:Babylon/Translations is the best page to present to newcomers, but there should also be initiatives to improve it by creating new or complementary forms of introduction and training, such as instructional videos. That way, we will better welcome those who are new to the movement.

Now, as much as I wish to make content available in all languages, it’s essential to focus our attention on those spoken by the most active communities. There is a substantial effort by Community Liaisons to provide support to those languages, including the creation of a list of active tech translators, so I used that list as a reference to understand who we need to recruit.

My next step was to define a set of pages to analyze, keeping in mind all 23 languages mentioned in the active tech translators list. As Help:Contents receives a significant number of views [2] and is mentioned as the reference for those looking for help on MediaWiki, I decided to study it and all the pages mentioned in it as well [3].

| Language | Translation rate | Translation rank | Views | Monthly average | Views rank |
|---|---|---|---|---|---|
| Arabic (ar) | 15.85% | 18 | 28,279 | 943 | 7 |
| Catalan (ca) | 83.18% | 2 | 18,078 | 603 | 16 |
| Czech (cs) | 32.40% | 14 | 20,238 | 675 | 13 |
| German (de) | 67.30% | 9 | 63,877 | 2,129 | 2 |
| Greek (el) | 16.25% | 17 | 3,143 | 105 | 23 |
| Spanish (es) | 59.78% | 10 | 34,641 | 1,155 | 5 |
| Persian (fa) | 15.13% | 21 | 20,199 | 673 | 14 |
| Finnish (fi) | 15.15% | 20 | 18,278 | 610 | 15 |
| French (fr) | 73.78% | 5 | 35,035 | 1,168 | 4 |
| Hebrew (he) | 27.63% | 15 | 17,259 | 575 | 19 |
| Hungarian (hu) | 12.15% | 22 | 18,885 | 596 | 17 |
| Italian (it) | 36.00% | 13 | 21,373 | 712 | 11 |
| Japanese (ja) | 68.53% | 8 | 31,117 | 1,037 | 6 |
| Korean (ko) | 46.25% | 12 | 24,997 | 833 | 8 |
| Dutch (nl) | 20.88% | 16 | 20,976 | 699 | 12 |
| Polish (pl) | 70.83% | 6 | 22,829 | 761 | 9 |
| Portuguese (pt) | 74.73% | 4 | 17,782 | 593 | 18 |
| Portuguese (pt-br) | 82.25% | 3 | 21,893 | 730 | 10 |
| Russian (ru) | 70.30% | 7 | 43,302 | 1,443 | 3 |
| Swedish (sv) | 8.33% | 23 | 5,915 | 197 | 22 |
| Turkish (tr) | 15.63% | 19 | 17,187 | 573 | 21 |
| Ukrainian (uk) | 56.28% | 11 | 17,249 | 575 | 19 |
| Chinese (zh) | 86.53% | 1 | 69,192 | 2,306 | 1 |

Chinese, Catalan, Brazilian Portuguese, European Portuguese, French and Polish are the languages with the highest translation rates on mediawiki.org. However, of these six languages, only two (Chinese and French) hold similar positions in the ranking by average monthly views, and only four (Chinese, French, Brazilian Portuguese and Polish) are among the ten most viewed languages. On the other hand, Swedish, Hungarian, Persian, Finnish, Turkish and Arabic are the languages with the lowest translation rates. The positions of Swedish and Turkish are similar in both rankings. Surprisingly, however, the positions of the other languages in the translation ranking and the pageviews ranking differ a lot, especially for Arabic, which ranks seventh by views.
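To make the comparison between the two rankings easier to follow, here is a minimal Python sketch. It only reuses the two rank columns from the table; the names `RANKS` and `rank_gaps` are made up for this illustration and are not part of any Wikimedia tool.

```python
# (language code, translation-rate rank, pageviews rank), copied from the table.
RANKS = [
    ("ar", 18, 7), ("ca", 2, 16), ("cs", 14, 13), ("de", 9, 2),
    ("el", 17, 23), ("es", 10, 5), ("fa", 21, 14), ("fi", 20, 15),
    ("fr", 5, 4), ("he", 15, 19), ("hu", 22, 17), ("it", 13, 11),
    ("ja", 8, 6), ("ko", 12, 8), ("nl", 16, 12), ("pl", 6, 9),
    ("pt", 4, 18), ("pt-br", 3, 10), ("ru", 7, 3), ("sv", 23, 22),
    ("tr", 19, 21), ("uk", 11, 19), ("zh", 1, 1),
]


def rank_gaps(rows):
    """Return (code, gap) pairs sorted by gap, largest first.

    gap = translation rank - views rank, so a positive gap means the
    language is read more than its translation coverage would suggest.
    """
    gaps = [(code, t_rank - v_rank) for code, t_rank, v_rank in rows]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Show the five languages where demand most outpaces translation.
    for code, gap in rank_gaps(RANKS)[:5]:
        print(f"{code}: {gap:+d}")
```

Arabic comes out on top with a gap of 11 positions, while Catalan and European Portuguese sit at the other end with a gap of -14: they are well translated but comparatively little read. A rank difference is a crude measure, so treat it as a way to spot candidates for attention rather than as a precise metric.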

Understanding the reasons behind those numbers is not just a matter of comparing pageviews and translation rates; it is necessary to consider social aspects such as how proficient in English the speakers of those languages are. Consider the EF English Proficiency Index (EF EPI) as a reference: countries like the Netherlands, Sweden, Finland, Germany, Poland, Hungary, the Czech Republic and Portugal have “very high” or “high” proficiency levels. Greece, Argentina, Spain, Hong Kong, South Korea, France and Italy have “moderate” proficiency levels. And China, Japan, Russia, Taiwan, many Latin American countries like Brazil and Colombia, Iran, Afghanistan and Qatar are among those with “low” or “very low” proficiency. This helps to explain, for example, why there is such a high demand for documentation in Arabic even though its translation rate is one of the lowest.

Other important factors are the possibility of access to Wikimedia projects (which is more difficult in countries like Turkey), the level of recognition of Wikimedia projects in several countries (as evidenced by the Inspire campaigns), and the organization of the communities in question.

Still, while being as large as the Wikimedia Foundation and its projects comes with a set of downsides, it also comes with a good number of advantages. Wikimedia projects are established as a reference in open knowledge and are admired by thousands of people. Those who read and those who contribute believe in our values and the quality of our work, so the most sensible thing to do to improve the current state of user guide translations is to ask for their help.

Translation teams usually have a small number of people, and this works in our favor, as it’s possible to make a lot of progress with few contributors. And while it’s viable to find technical translators among people who already contribute to other Wikimedia or free and open-source projects, it’s also beneficial to the Wikimedia movement and MediaWiki to look for new volunteers. After all, most contributors already dedicate their free time to specific projects. Although I am sure some would love to find room to help (and they are welcome!), this can become overwhelming quickly.

So, to find new translators, we need to look for places where diversity is welcomed and open knowledge is valued. We also need people who speak their native language well and understand English at an intermediate level or above. Because of that, reaching out to university students and professors is our best bet, given that this kind of collaboration has been growing in the last few years.

Talking to professors, especially those who dedicate their studies to fields such as linguistics and translation, can be a valuable source of knowledge and the beginning of a partnership with universities to help us develop, for instance, a fitting set of best translation practices for MediaWiki. This is, in fact, one of the subjects of a conversation I am having with a professor involved in coordinating the Translation course of the Federal University of Uberlândia (UFU).

As for students, there are multiple reasons I suspect they would be wonderful contributors. While they are encouraged to learn English throughout their time at university due to professional demands, they have few or no opportunities to use that knowledge outside their classrooms. In addition, they are encouraged to look for different but relevant extracurricular activities, but most of those can’t be done from the comfort of their home.

Technical translations give them a chance to put their fluency to the test while improving their vocabulary and reading comprehension. Translating documentation is also a great and easy way to begin contributing to Wikimedia projects, as the Translate extension offers translators an easy-paced workflow, and you learn more about organizational nuances and technical details the more you translate.

Therefore, in recent weeks I have worked on two fronts: communicating with professors and others involved in university administration to publicize the role of technical translator as an interesting extracurricular activity for students; and talking directly with those students, using promotional materials that draw on the relationship between Wikipedia and MediaWiki and directing them to a shorter version of the Translate extension user documentation. I reach these two groups in three ways: directly but virtually, through emails or messages on social networks such as Twitter; in person, in meetings with coordinators of language schools or undergraduate courses; and indirectly, through promotional material distributed by volunteer students at various universities. I have been testing this strategy locally in my country of residence, Brazil.

There are points of failure throughout the technical translation process, from the quality of the source text to the lack of a strong translation community, and the path to solving them is long. MediaWiki needs to look to good examples of documentation practices, like Atlassian or Write the Docs, and establish and enforce a set of good practices for its own documentation. It also needs to improve its localization practices, following examples such as Mozilla Firefox, and improve the resources made for technical translators. Better training, with tutorials based more on videos or other visual resources and less on text, would be a better way to introduce newcomers to the tools they will use. Simple but effective introductions, like the one provided on Meta:Babylon, are also essential and need to be publicized more widely. Lastly, building bridges between long-time contributors and those new to the movement is a must. While you can contact other translators through the translators mailing list, that channel has many limitations: it isn’t a proper place for real-time discussions, and email is becoming a less used means of communication. Promoting the establishment of teams for each language, encouraging them to create and organize their own conventions for recurrent translations and writing style, and electing volunteers among them to communicate directly with newcomers will give all of them a sense of belonging and support.

That said, the legacy of 16 years of MediaWiki development, including all the user guides available at the moment, is still relevant and useful, and it needs recognition as much as it needs attention. When you dedicate a few hours of your month to translating documentation that covers important aspects of MediaWiki, such as editing, into your native language, you help us give users access to tools that enhance their contributions, and you give them a better understanding of the interfaces they use. And while this helps to increase the quality of the content created, it also raises the chances of improving the software: better-informed users write better reports on the problems they face, improving communication between them and developers.


  1. Pages inside wikis using the MediaWiki software can be organized by namespaces to differentiate them by purpose. ↩︎

  2. The Wikimedia Foundation provides a publicly available tool with which you can access pageview data for various periods of time. The earliest available date is January 7, 2015. ↩︎

  3. Data collected from January 5 to 8, 2018. ↩︎