A Bit of History
The openZIM project was created almost 15 years ago. Its primary goal was to specify a file-based storage solution to store and access Wikipedia (MediaWiki) content offline efficiently; the implementation of an official reader/writer software library came right after. The open ZIM specification and libzim are still at the core of the project. A few years later, the ZIM tools were launched to make it easy to inspect and manipulate ZIM files on the command line.
Over the following years, however, most of the project activity has grown around scrapers, and we now actively maintain more than 15 of them. After a while, it became apparent that we should start sharing code to avoid too much duplication and to rationalise maintenance. Therefore, a few libzim bindings and scraper libraries were created.
In the last few years, facing increasing demand, we have developed the project to industrialise the ZIM creation process. This effort led to the launch of the Zimfarm and of our custom CMS. We have also taken care of the further development of WP1, the Wikipedia selection infrastructure.
Today, the openZIM project looks pretty different from what it was at the start: it actually delivers far more than was first expected. It has moved from a pure software project to what looks more and more like a publishing organisation.
While these efforts were under way on openZIM, the Kiwix project continued its own journey. Even if maintaining the Kiwix software stack is a never-ending story, this is a mature portfolio and there is no disruptive plan. But one interesting lesson is that people are not interested in Kiwix itself, but in the content which is made available offline. This is pretty clear if we consider the success of our Android custom apps.
In 2021, we finally isolated the kiwix-hotspot activities in a dedicated "OffSpot silo", which helps to better appreciate what Kiwix is as an organisation. The OffSpot activities are strategic, but are not much affected by what follows.
Because of this history and the current context, we believe that we could deliver better if we were more focused on content. We are convinced that we should move from a software-centric approach to a more content-centric approach. Without leaving (at all) the field of software development, we should move further toward the publishing field.
Because ultimately users are more interested in content than in software (which is only a means), we believe this is also a way to improve our access to funding.
The goal is to offer more and better content in the ZIM format, where our software stack can really make a difference.
By "better" we mean:
- Ensuring that the content is attractive and user-friendly. It should not suffer from bad layout, broken links, or similar weaknesses, which are too often the case currently.
- Better-adapted content, which means an effort at the curation level: checked revisions, selections only, any kind of curation.
By "more" we mean:
- Always ensuring that new versions can be delivered properly
- Providing solutions that allow people to create their own ZIM files following a self-service workflow
- Diversifying the content offer, including non-free content.
Developing a quality content offer makes no sense if the portfolio is not comprehensive. It is therefore important to develop a comprehensive library.
Pursuing these goals will be a long journey, but we are not starting from nothing. Serious achievements have been made in the last few years around the scrapers, the industrialisation of publishing processes, and even the quality of the library.
Here would be the primary axes:
- Improve the quality of ZIM files
- Focus investments on scrapers
- Develop the JS-API to allow better interactions between readers and content
- Continue the development of the tooling to have better control over the publishing of content:
  - Improve the automatic QA chain with "zimcheck" and its integration
  - Develop the CMS to have efficient human-driven / semi-automatic control over the library
  - Recruit people to focus on publishing/moderation work
- Develop self-service tools:
  - Continue the WP1 project transformation to allow anybody to make selections and better choose revisions for Wikimedia projects (not only Wikipedia in English)
  - Improve the publishing solution to allow non-technical people to publish custom applications (maybe not even only on Android)
- Have a non-free content approach:
  - Allow a ZIM creation and publishing toolchain which is private/proprietary and non-public, primarily for the OffSpot (cardshop).
  - Set clear pricing for non-free content, proportional to how proprietary it is (the most proprietary content is the most expensive)
- Improve the library
  - Improve search/filtering/relevance of the results shown in the library
  - Develop library.kiwix.org into a fancy content store
  - Allow proprietary library creation/maintenance on demand
  - Finish/improve library integration in Kiwix ports