Known Bugs

Please report bugs to http://bugs.openzim.org/!

unspecified: NEW (7), RESOLVED (9), total 16

Current List of Bugs

ID | P | Status | Severity | Version | Product | Summary (16 tasks)
16

I have already changed zimlib to check only the first 2 bytes of the 4-byte version number. Future versions can then use the remaining bytes for a minor number. The current zimlib ignores a non-zero minor number, while zimwriter writes a minor number of zero.
P1 | RESOLVED | enhancement | unspecified | openZIM | Version & Subversion ZIM format handling
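
The version check described in the comment on bug 16 could look roughly like the following. This is only a sketch: the names `ZimVersion`, `decodeVersion` and `isSupported` are made up for illustration and are not the zimlib API, and it assumes the 4-byte field is stored in little-endian order.

```cpp
#include <cstdint>

// Hypothetical type for illustration; not part of zimlib.
struct ZimVersion {
    std::uint16_t majorVersion;  // first 2 bytes of the version field
    std::uint16_t minorVersion;  // remaining 2 bytes, reserved for a minor number
};

// Decode the 4-byte version field (assumption: little-endian byte order).
inline ZimVersion decodeVersion(const unsigned char bytes[4]) {
    ZimVersion v;
    v.majorVersion = static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
    v.minorVersion = static_cast<std::uint16_t>(bytes[2] | (bytes[3] << 8));
    return v;
}

// A reader accepts the file when the major number matches.
// The minor number is deliberately ignored.
inline bool isSupported(const ZimVersion& v, std::uint16_t supportedMajor) {
    return v.majorVersion == supportedMajor;
}
```

With this split, a reader keeps accepting files whose minor number is non-zero, as long as the first two bytes are unchanged.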
1

This is not included in the new ZIM format. I have to test it.
P2 | NEW | enhancement | unspecified | openZIM | Handling of categories inside the ZIM format
10

I added two configure options, --with-cluster-cache-size=number and --with-dirent-cache-size, to tune the default cache sizes. In addition, the environment variables ZIM_CLUSTERCACHE and ZIM_DIRENTCACHE are read; when set, they override the cache sizes.

The cluster cache holds uncompressed clusters. By default zimlib caches the last 16 clusters, and with the default cluster size of 1MB this means 16MB of memory usage. On small devices the cache size may be reduced.

The dirent cache size is 512 by default. Dirents are small, about 30 bytes each, so there is little point in reducing that value.
P2 | RESOLVED | enhancement | unspecified | openZIM | caching on low memory devices
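
As an illustration of the override behaviour described for bug 10, here is a minimal sketch of reading a cache size from an environment variable with a compiled-in default. The function names are hypothetical and the parsing rules (fall back on unset or non-numeric values) are an assumption, not necessarily what zimlib does.

```cpp
#include <cstdlib>

// Hypothetical helper: return the value of the environment variable `name`
// as an unsigned number, or `fallback` when it is unset or not a number.
inline unsigned cacheSizeFromEnv(const char* name, unsigned fallback) {
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0')
        return fallback;
    char* end = nullptr;
    unsigned long parsed = std::strtoul(value, &end, 10);
    if (*end != '\0')  // trailing garbage: treat as invalid
        return fallback;
    return static_cast<unsigned>(parsed);
}

// Memory held by cached uncompressed clusters, e.g. 16 clusters * 1MB = 16MB.
inline unsigned long clusterCacheBytes(unsigned clusters, unsigned long clusterSize) {
    return clusters * clusterSize;
}
```

A reader would call something like `cacheSizeFromEnv("ZIM_CLUSTERCACHE", 16)` and `cacheSizeFromEnv("ZIM_DIRENTCACHE", 512)` once at startup, using the defaults quoted above.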
4

or a central "registry" which holds the UUIDs... we might need UUIDs to mark the content and another flag to mark language and the version.

So all "Wikipedia full dumps" made by the WMF would have the same UUIDs, but different languages and different versions (WMF might use the timestamp here).

The WP 1.0 project would use another UUID for its selection. If there are different selections (with different criteria that are independent of existing selections), they would have different UUIDs.
P3 | NEW | enhancement | unspecified | openZIM | New header fields
11

Currently, the ZIM format can only support a limited and predefined set of document mime-types.

You can get the list of supported mime-types here: http://www.openzim.org/ZIM_File_Format#Mime_types

Mime-types are represented by a number in a ZIM file, and the mapping is done statically by the zimlib.

This means that you cannot store documents with a custom mime-type.

This is a problem for me because I have people who use Kiwix and deal with other mime-types, for example archives or binaries.

I think making this mime-type table dynamic is a necessary improvement.

I see two solutions:

  • The ZIM file creator specifies it manually during the creation process.
  • It happens automatically (also during the creation process).
P3 | NEW | enhancement | unspecified | openZIM | Dynamic mime-types
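
To make the "automatic" variant of the dynamic mime-type proposal in bug 11 more concrete, here is a small sketch of a table that assigns numbers to mime-types on first use during creation. `MimeTypeTable` is a hypothetical class written for this illustration; it says nothing about how such a table would actually be stored in a ZIM file.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical table, not part of zimlib: numbers are assigned to
// mime-type strings in the order in which they are first seen.
class MimeTypeTable {
public:
    // Return the number for a mime-type, registering it if it is new.
    std::size_t idFor(const std::string& mimeType) {
        std::map<std::string, std::size_t>::const_iterator it = ids_.find(mimeType);
        if (it != ids_.end())
            return it->second;
        std::size_t id = names_.size();
        names_.push_back(mimeType);
        ids_[mimeType] = id;
        return id;
    }

    // Reverse lookup for a reader; throws std::out_of_range for unknown numbers.
    const std::string& nameFor(std::size_t id) const { return names_.at(id); }

    std::size_t size() const { return names_.size(); }

private:
    std::map<std::string, std::size_t> ids_;
    std::vector<std::string> names_;
};
```

The writer would call `idFor` for every article it adds and store the list of names alongside the articles, so the reader no longer depends on a static mapping.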
2

zimlib now compiles with cl.exe, the Microsoft C++ compiler. We do not have a Makefile... but is it necessary?
P5 | NEW | enhancement | unspecified | openZIM | Add Windows/ReactOS support
7

LZMA (de)compression is now in svn trunk.

Portability checks are ongoing.
P5 | NEW | enhancement | unspecified | openZIM | Improve compression
14

We need a way to check the quality of a ZIM file.

At least the following should be checked:

  • (WARNING) has a welcome page
  • (ERROR) broken local HTML links
  • (WARNING) redundant content
  • ...
P5 | NEW | enhancement | unspecified | openZIM | zim-check
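
The first two checks listed for bug 14 can be sketched over a simplified in-memory model. Everything here is hypothetical: `LinkMap` stands in for the articles of a ZIM file and their local links, and `checkArticles` is not an existing tool or zimlib function.

```cpp
#include <map>
#include <string>
#include <vector>

enum class Severity { Warning, Error };

struct Finding {
    Severity severity;
    std::string message;
};

// Hypothetical model: article URL -> local links found in that article.
typedef std::map<std::string, std::vector<std::string> > LinkMap;

// Report a missing welcome page as a warning and every link whose
// target is not among the articles as an error.
inline std::vector<Finding> checkArticles(const LinkMap& articles,
                                          const std::string& welcomeUrl) {
    std::vector<Finding> findings;
    if (welcomeUrl.empty() || articles.count(welcomeUrl) == 0)
        findings.push_back(Finding{Severity::Warning, "no welcome page"});
    for (const auto& entry : articles)
        for (const auto& target : entry.second)
            if (articles.count(target) == 0)
                findings.push_back(Finding{
                    Severity::Error,
                    "broken link: " + entry.first + " -> " + target});
    return findings;
}
```

A check for redundant content would need the article bodies as well and is left out of this sketch.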
15

For printing to STDOUT (without using a string-based third-party logging framework), this is already the best option. You are right.

But for everything else, I think a toString() method would be welcome.
P5 | NEW | enhancement | unspecified | openZIM | zim::File getUuid() should return an HEX encoded MD5 hash
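
The kind of toString() conversion requested in bug 15 could be as small as the following sketch. `toHexString` is a hypothetical free function, and representing the UUID as a `std::array` of 16 bytes is an assumption made for this example.

```cpp
#include <array>
#include <string>

// Hypothetical helper: hex-encode a 16-byte identifier into
// 32 lowercase hexadecimal characters.
inline std::string toHexString(const std::array<unsigned char, 16>& uuid) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(uuid.size() * 2);
    for (unsigned char byte : uuid) {
        out.push_back(digits[byte >> 4]);    // high nibble
        out.push_back(digits[byte & 0x0f]);  // low nibble
    }
    return out;
}
```

The result can be printed, logged or compared as an ordinary string, which covers the cases where streaming the raw bytes to STDOUT is not enough.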
3

There is already a uuid field in ZIM files. It is a 16-byte random field, filled with the MD5 sum of the current timestamp, including microseconds, as returned by the system function gettimeofday.
P5 | RESOLVED | enhancement | unspecified | openZIM | Add a unique identifier header field
5
P5 | RESOLVED | major | unspecified | openZIM | zimreader on arm: miss-aligned dirent header
6
P5 | RESOLVED | blocker | unspecified | openZIM | zimreader on arm: cxxtools/m4/asmtype.m4 broken for arm architecture
8

Would it be possible to add a configure option, like --enable-tests, to avoid compiling the test framework? Otherwise the developer is forced to install cxxtools, which is not really great in my opinion.
P5 | RESOLVED | normal | unspecified | openZIM | zimlib unable to simply compile without cxxtools
9

Bug #10 should resolve that. It is not reasonable to create device-specific ZIM files. Instead, Bug #10 makes the number of clusters that are cached when reading configurable. When the default cluster size of 1MB is used and the cache size is set to 0, this 1MB is still needed, which I feel is a reasonable value. I don't think we need to support devices that have less memory.

1MB is also the value normally used in compression libraries, so increasing the value does not lead to better compression ratios. On the other hand, smaller values lead to worse compression ratios.
P5 | RESOLVED | enhancement | unspecified | openZIM | optimize cluster size (small devices limitations)
12

The reason for the problem was a failed data transfer of the ZIM file. Zimlib correctly responded with an exception when trying to read the cluster, which was not in the file.

While investigating the reason for the problem (it was actually not a bug), I discovered that zimwriter does not check whether the file was written successfully. This is fixed now.
P5 | RESOLVED | blocker | unspecified | openZIM | ZimReader freezes
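
The kind of write check mentioned for bug 12 can be sketched with standard C++ streams. This is not the actual zimwriter fix; `writeChecked` is a hypothetical function that shows one way to detect a failed or incomplete write.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write `data` to `path` and throw if any step fails,
// so that a truncated file is not silently left behind.
inline void writeChecked(const std::string& path, const std::string& data) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out)
        throw std::runtime_error("cannot open " + path + " for writing");
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.close();  // flushes buffered data; sets failbit if that fails
    if (out.fail())
        throw std::runtime_error("failed to write " + path);
}
```

Checking the stream state only after `close()` matters because data may still sit in the buffer after `write()` returns; an error such as a full disk can surface only when the buffer is flushed.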
13

zim::File::find always returns the next article when no exact match is found.

There is another method, "zim::File::findx(char ns, const QUnicodeString& title, bool collate = false)". It returns a "std::pair<bool, const_iterator>". The first element is true if an exact match was found and false otherwise. The second element is the const_iterator. zim::File::find is actually a wrapper around findx that throws the bool flag away and returns just the const_iterator.

You may as well use zim::File::getArticle. This uses findx as well but returns a null article (article.good() returns false) if no exact match was found.
P5 | RESOLVED | normal | unspecified | openZIM | zim::File::find always returns an article
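
The relationship between find and findx described for bug 13 can be illustrated with a sorted std::map standing in for the title index. The functions `findxSketch` and `findSketch` below are hypothetical stand-ins written for this example; they mimic the described semantics and do not use the real zim::File signatures.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the sorted title index of a ZIM file.
typedef std::map<std::string, int> Index;

// Like findx as described above: the flag tells whether the match is exact,
// the iterator points to the match or to the next entry in sort order.
inline std::pair<bool, Index::const_iterator>
findxSketch(const Index& index, const std::string& title) {
    Index::const_iterator it = index.lower_bound(title);
    bool exact = (it != index.end() && it->first == title);
    return std::make_pair(exact, it);
}

// Like find as described above: the flag is thrown away, so a miss
// still yields the next entry instead of an error.
inline Index::const_iterator findSketch(const Index& index, const std::string& title) {
    return findxSketch(index, title).second;
}
```

A caller that needs to distinguish "found" from "next best" has to use the pair-returning variant and test `.first`, which is what the comment suggests getArticle does internally.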