Hi,
I have been looking at the various means of maintaining a current Wikipedia dump in a format that is usable on the Zaurus.
At present the only really good option seems to be running wiki2static and serving the pages from Apache to a local browser. Compressed and without images, the 425325 articles in the current dump run to 616MB, and it has grown by about 100MB in a couple of months!
ZBedic seemed appealing. However, the last reasonably good version of the en database is pretty ancient (in Wikipedia terms), and the rendering is a bit buggy, even for what it actually chooses to render.
My view is that this is workable at the moment if you have a high-capacity storage card and squashfs to hold an entire en Wikipedia dump. However, the growth rate is such that it may not be long before it no longer fits on a dedicated 1GB card ... and then a dedicated 2GB card.
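As a rough back-of-envelope check on that growth, here is a minimal Python sketch. It assumes the growth is linear at about 50MB per month (taking "a couple of months" to mean two, which is only a guess) and that a 1GB card gives 1024MB of usable space with no filesystem overhead. The function name is made up for illustration.

```python
def months_until_full(current_mb, growth_mb_per_month, card_mb):
    """Months until a dump growing linearly no longer fits on the card."""
    if growth_mb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    remaining = card_mb - current_mb
    # A card that is already too small has zero months left.
    return max(0.0, remaining / growth_mb_per_month)


CURRENT_MB = 616          # compressed dump size quoted above
GROWTH_MB_PER_MONTH = 50  # assumption: ~100MB over two months

for card_mb in (1024, 2048):
    months = months_until_full(CURRENT_MB, GROWTH_MB_PER_MONTH, card_mb)
    print(f"{card_mb}MB card: about {months:.0f} months")
```

Under those assumptions the 1GB card lasts roughly eight months and the 2GB card a bit over two years. If the growth is accelerating rather than linear, both figures will be optimistic.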
Current attempts at conversion to ZBedic seem to produce files around the 350MB mark, but they are quite buggy. Something in the wiki2bedic.pl scripts seems to lose its way, and if you try lookups on word ranges starting with X, Y or Z the application hangs. The formatting of the original Wikipedia pages is also badly sacrificed when viewing through ZBedic.
The formatting currently supported by ZBedic is fairly crude. I think you can do bold and italic, and you could do bulleted lists with * characters (the current wiki2bedic.pl doesn't handle this properly, certainly not for the ** bullets in the Sharp Zaurus articles). Tables would be a waste of time: it would be feasible to construct text-bounded tables using grids made of +, _ and | characters, but since there doesn't seem to be any way of requesting a fixed-width font at a certain point in the article, that idea is clearly a bad one.
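To illustrate the bullet handling, here is a minimal sketch of turning * and ** wiki bullets into indented plain-text bullets. It is written in Python rather than Perl and is not how wiki2bedic.pl does it; the function name, the two-space indent and the "-" marker are all assumptions, and it also assumes ZBedic would keep leading spaces when displaying the text.

```python
import re

# One or more leading asterisks, optional whitespace, then the item text.
BULLET_RE = re.compile(r"^(\*+)\s*(.*)$")


def render_bullets(wikitext, indent="  ", marker="-"):
    """Convert wiki-style '*' / '**' bullets to indented plain-text bullets."""
    out = []
    for line in wikitext.splitlines():
        match = BULLET_RE.match(line)
        if match:
            depth = len(match.group(1))  # number of asterisks = nesting level
            out.append(f"{indent * (depth - 1)}{marker} {match.group(2)}")
        else:
            out.append(line)  # non-bullet lines pass through unchanged
    return "\n".join(out)


sample = "Models:\n* SL-5500\n** 64MB RAM\n* SL-C3000"
print(render_bullets(sample))
```

Only lines that begin with an asterisk are touched, so ordinary prose passes through as-is. Numbered (#) lists and definition lists would need the same sort of treatment.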
So, I'm wondering if anybody has any thoughts on the following options as a way forward, or on other options that may have been overlooked.
i. Add a rendering engine to ZBedic that handles enough styles for Wikipedia or general encyclopedia-type projects, and maintain the conversion scripts alongside the ZBedic rendering engine.
ii. Produce a 'miniMediaWiki' system aimed at serving MediaWiki content with a built-in search engine. This would save the redundant space consumed by writing headers etc. to static pages.
iii. Produce an XML engine to serve Wikipedia content from a highly compressed version of the database (search engine included), and serve that to a rendering application that provides the interface.
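To make option iii a little more concrete, here is a very rough sketch of the storage side only: articles packed into zlib-compressed blocks, with a title index saying which block holds each article. Everything here is an assumption for illustration (the class name, the block size, keeping it all in memory), and the "search" is just a title prefix match, not a real full-text search engine.

```python
import zlib


class CompressedStore:
    """Articles packed into zlib-compressed blocks, with a title index."""

    def __init__(self, articles, block_size=4):
        # articles: dict of title -> text; block_size: articles per block.
        self.blocks = []  # compressed byte strings
        self.index = {}   # title -> (block number, byte offset, byte length)
        titles = sorted(articles)
        for i in range(0, len(titles), block_size):
            buf = bytearray()
            for title in titles[i:i + block_size]:
                data = articles[title].encode("utf-8")
                self.index[title] = (len(self.blocks), len(buf), len(data))
                buf += data
            self.blocks.append(zlib.compress(bytes(buf), 9))

    def get(self, title):
        """Decompress only the block that holds the requested article."""
        block_no, start, length = self.index[title]
        raw = zlib.decompress(self.blocks[block_no])
        return raw[start:start + length].decode("utf-8")

    def search(self, prefix):
        """Return the titles starting with the given prefix, sorted."""
        return [t for t in sorted(self.index) if t.startswith(prefix)]


store = CompressedStore(
    {"Zaurus": "A PDA by Sharp.", "Apache": "A web server.",
     "Squashfs": "A compressed filesystem."},
    block_size=2,
)
print(store.search("S"), store.get("Zaurus"))
```

The block size is the trade-off: bigger blocks compress better but mean more decompression work for every lookup, which matters on a Z. A real version would keep the blocks and the index on the storage card rather than in memory.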
One point to note is that although something like ZBedic is not great at rendering the content faithfully, it IS great to navigate on a Z: it's much easier than browser navigation and can be driven without a stylus.
A second point to note, though, is that the browser interface is much more portable, and people running PDAXROM are going to find this the better option, at least in the short term. I'm not sure if they have got qbedic running on PDAX yet. I would prefer Firefox though!
So, in the longer term, if we want to keep up-to-date Wikipedia content on the Z we will need to do something about it (and possibly, eventually, work out a category selection/caching solution for partial content downloads). As I say, though, for the meantime you can use a browser, Apache, squashfs and a large storage card.
Any thoughts, folks?
- Andy