Author Topic: Serving Wikipedia Locally On The Z  (Read 6000 times)

ShiroiKuma

  • Hero Member
  • *****
  • Posts: 900
« on: October 23, 2006, 10:17:17 am »
With the availability of Wikipedia dumps I'm searching for the best way to serve these and browse them in a web browser locally on the Z.

There has been some recent discussion of a user running wiki2static.pl to convert Wikipedia dumps and viewing the result from a squashfs image on a Zaurus. However, the dump format has changed since then; I think it is XML now.

So I'd like to ask whether anyone is doing this, and what you are using for it. I've seen this, which seems like a good choice since I'm already running LAMP on the Z. However, you don't get images...

Any recommendations on the best way to do it on the Z? Obviously it would have to be a squashfs image in order to fit the data on a card. But how to do it? Like the above post? Convert to a MySQL format on a PC? Is anyone doing it this way or another?
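
For the "convert on a PC" step, here is a minimal Python sketch of streaming through an XML dump. It assumes the dump follows the usual MediaWiki export layout (<page> elements containing <title> and <revision><text>). The function name iter_pages and the sample data are made up for illustration, and this is not how wiki2static.pl works.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix: '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) pairs from a MediaWiki-style XML dump.

    `source` is a file name or a binary file object. Pages are handled one
    at a time, so the whole dump never has to be parsed into memory at once.
    """
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's children and text


# Tiny made-up sample standing in for a real multi-gigabyte dump.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Zaurus</title>
    <revision><text>The Sharp Zaurus is a PDA.</text></revision>
  </page>
  <page>
    <title>SquashFS</title>
    <revision><text>A compressed read-only file system.</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for page_title, page_text in pages:
    print(page_title, "->", len(page_text), "characters")
```

Each (title, text) pair could then be written into whatever the Z ends up reading, whether that is MySQL rows or static files inside a squashfs image.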
"The whole Czech nation is a gang of malingerers."
Militäroberarzt Bautze

stbrock

  • Full Member
  • ***
  • Posts: 149
« Reply #1 on: October 29, 2006, 05:43:15 pm »
I would be interested in this also, and believe Meanie said at one point that he was working on this. Perhaps one day soon he will surprise us with a post.  

Rather than running MediaWiki, Apache, etc., there is an alternative approach that sounded appealing at first: converting the data files to compressed HTML files that could be browsed directly. However, after a little investigation, it seems that there is so much variation in file format and so much loss of functionality involved that it did not work very well, and the conversion programs I saw didn't seem to be actively maintained.

Jon_J

  • Hero Member
  • *****
  • Posts: 1853
« Reply #2 on: October 29, 2006, 06:31:44 pm »
I don't know if this is off topic.
I use Zbedic and after looking at Meanie's page, I downloaded
"en-wikipedia_0.9.5_20050209.dic.dz"
It doesn't have any pictures, just text.
Google should find it for you. It is 412MB and slows down launching of zbedic.
I just put it on my C3100 hard drive in same directory as the dictionary file that I normally use with zbedic.
/hdd3/QtPalmtop/share/zbedic/
It only slows down zbedic if it's enabled in the dictionary selection screen.
I think I'll try making another copy of zbedic and renaming it to zbedic2.
With 2 different copies of zbedic, I could load one with the usual dictionary file for fast lookup, and load the second one with the wikipedia, since it launches so slowly.

EDIT: I tried making a second copy of the binary zbedic as zbedic2 and I also put it in a different tab, but if I load the wikipedia into one of them, it's loaded into both of them.
This kind of makes the "Dictionary" function of zbedic useless as a quick word lookup app.
I'm sure this is the reason I removed the wikipedia after trying it once in zbedic.
I thought having 2 binaries with different names would allow me to run two different instances of zbedic and have one loaded with just the dictionary, and the other loaded with both the wikipedia and dictionary.
I have used this same method of copying and renaming a binary once before, and it still works for qkonsole.
(I have 2 copies of qkonsole in 2 different tabs. One runs in magnified mode, the other runs in "normal" mode)
« Last Edit: October 29, 2006, 07:07:24 pm by Jon_J »
C3100 Multiboot-->Angstrom 2007.12-r18 | Cacko 1.23 | ArchLinuxARM
C3200 pdaxii13v2-5.5-alpha4 Akita on NAND

Ambicom WL1100C-CF Wifi - Ambicom CF modem - Ambicom CF GPS - Belkin-F5D5050 USB LAN
Socket CF Bluetooth rev K - Iogear 4 port USB micro hub - pocket CF card reader
Targus mini USB optical mouse - 2 Targus SD card readers

desertrat

  • Hero Member
  • *****
  • Posts: 743
« Reply #3 on: October 29, 2006, 08:37:59 pm »
Quote
I thought having 2 binaries with different names would allow me to run two different instances of zbedic and have one loaded with just the dictionary, and the other loaded with both the wikipedia and dictionary.
Both copies of the binary are sharing the same configuration files. What you need is 2 sets of configuration files and some way of switching between them, rather than 2 binaries.
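
A minimal sketch of that idea in Python, assuming zbedic keeps its settings in a single file. The file name zbedic.conf, the profile names, the settings contents and the switch_profile helper are all hypothetical, so check where zbedic really stores its configuration before trying anything like this.

```python
import shutil
import tempfile
from pathlib import Path

CONFIG_NAME = "zbedic.conf"      # hypothetical name of the active settings file
MARKER_NAME = "zbedic.profile"   # hypothetical file remembering the active profile


def switch_profile(settings_dir, profile):
    """Make the saved profile '<CONFIG_NAME>.<profile>' the active config.

    The current config is first copied back to the profile that was active,
    so changes made inside the application are not lost.
    """
    settings_dir = Path(settings_dir)
    active = settings_dir / CONFIG_NAME
    marker = settings_dir / MARKER_NAME
    wanted = settings_dir / f"{CONFIG_NAME}.{profile}"
    if not wanted.exists():
        raise FileNotFoundError(f"no saved profile named {profile!r}")
    if active.exists() and marker.exists():
        previous = marker.read_text().strip()
        shutil.copyfile(active, settings_dir / f"{CONFIG_NAME}.{previous}")
    shutil.copyfile(wanted, active)
    marker.write_text(profile)
    return active


# Demonstration in a throwaway directory with made-up settings contents.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "zbedic.conf.dict").write_text("dictionaries=english\n")
    (d / "zbedic.conf.wiki").write_text("dictionaries=english,wikipedia\n")
    switch_profile(d, "dict")
    print((d / CONFIG_NAME).read_text(), end="")
    switch_profile(d, "wiki")
    print((d / CONFIG_NAME).read_text(), end="")
```

Two small launcher scripts, one per tab, could each call switch_profile with their own profile name and then start the single zbedic binary.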
SL-C3100 / Ambicon WL1100C-CF / pdaXrom 1.1.0beta3 / IceWM

rafm

  • Full Member
  • ***
  • Posts: 145
« Reply #4 on: October 30, 2006, 09:54:44 am »
Quote
It doesn't have any pictures, just text.
Google should find it for you. It is 412MB and slows down launching of zbedic.

The current solution is to check the "fast load" box for zbedic, if you do not mind having less RAM. The right solution would be implementing "lazy" loading of dictionaries, but I have never had time to do it.
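
To illustrate what "lazy" loading means here, a small Python sketch. It only shows the idea and says nothing about zbedic's actual code; the LazyDictionary class, its loader callables and the sample entries are invented for this example. Each dictionary is registered at startup, but its data is only read the first time a lookup actually reaches it.

```python
class LazyDictionary:
    """Wraps a dictionary whose entries are loaded on first use only."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader   # callable returning {word: definition}
        self._entries = None    # nothing loaded yet

    @property
    def loaded(self):
        return self._entries is not None

    def lookup(self, word):
        if self._entries is None:        # first access: pay the load cost now
            self._entries = self._loader()
        return self._entries.get(word)


def lookup_all(dictionaries, word):
    """Search the dictionaries in order and stop at the first hit."""
    for d in dictionaries:
        hit = d.lookup(word)
        if hit is not None:
            return d.name, hit
    return None


# Made-up loaders; a real one would open and index a large file on disk.
english = LazyDictionary("english", lambda: {"cat": "a small feline"})
wikipedia = LazyDictionary("wikipedia", lambda: {"Zaurus": "a Sharp PDA"})

result = lookup_all([english, wikipedia], "cat")
print(result, english.loaded, wikipedia.loaded)
```

Startup stays quick because no loader runs at launch, and a word found in the small dictionary never triggers loading of the large one.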

Images for wikipedia would probably take too much space. It is possible to have images in zbedic starting from 1.1, but nobody has so far tried it with Wikipedia.

To have two copies of zbedic, you would probably need to edit the "control" files in the .ipk. But I do not recommend this solution.
SL-C1000 w/ Cacko ROM 1.23

Jon_J

  • Hero Member
  • *****
  • Posts: 1853
« Reply #5 on: October 30, 2006, 12:24:09 pm »
I decided to disable wikipedia in the dictionary selector.
Launch Zbedic with just English dictionary - 3½ seconds.
Launch Zbedic with wikipedia & English dictionary - 8½ seconds.

I don't like using fast load.
I disable fast load on any app I install that has it enabled.
I'm keeping wikipedia on my HDD, so I can re-select it again if I need it.

stbrock

  • Full Member
  • ***
  • Posts: 149
« Reply #6 on: October 30, 2006, 11:37:23 pm »
The wiki2zaurus homepage hasn't been updated in a long time, and the latest converted Wikipedia available for download there is quite old. From what I read about changes in the Wikipedia dump format, it seemed that the old Perl scripts would need a good bit of work to process the current wiki files, but I could be wrong. Searching for wiki2zaurus on this forum gives a couple of recent threads that suggest someone may be working on this or another approach.

Overgauss

  • Newbie
  • *
  • Posts: 14
« Reply #7 on: November 10, 2006, 12:25:24 am »
Wikipedia needs to be on my Z. NEEDS.

speculatrix

  • Administrator
  • Hero Member
  • *****
  • Posts: 3706
« Reply #8 on: November 30, 2006, 07:34:11 am »
You don't necessarily need Apache installed on the Z to run CGI scripts. Check out my posting in security/networking for a shell-script web server which will run a CGI, and take a look in general discussion for my squashfs-oesf-forum-archive.

Maybe you can get a minimal MediaWiki CGI running with a minimal MySQL DB backend.
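
To show how little is needed to generate pages on request without Apache, here is a rough Python sketch using only the standard library's http.server module. It is a stand-in for illustration, not the shell-script web server mentioned above, and the ARTICLES data, render_page, WikiHandler and serve names are all made up.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

ARTICLES = {"Zaurus": "The Sharp Zaurus is a PDA."}  # made-up sample data


def render_page(path):
    """Return (status, html_body) for a request path such as '/Zaurus'."""
    title = unquote(path.lstrip("/"))
    text = ARTICLES.get(title)
    if text is None:
        return 404, "<h1>Not found</h1>"
    return 200, "<h1>%s</h1><p>%s</p>" % (html.escape(title), html.escape(text))


class WikiHandler(BaseHTTPRequestHandler):
    """Answers every GET by generating the page on the fly."""

    def do_GET(self):
        status, body = render_page(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8080):
    """Serve on localhost only, until interrupted."""
    HTTPServer(("127.0.0.1", port), WikiHandler).serve_forever()
```

Calling serve() and pointing the browser at http://127.0.0.1:8080/Zaurus would show the generated page. The lookup in render_page is where a real article store would plug in.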
Gemini 4G/Wi-Fi owner, formerly zaurus C3100 and 860 owner; also owner of an HTC Doubleshot, a Zaurus-like phone.

seiichiro0185

  • Newbie
  • *
  • Posts: 29
    • http://seiichiro0185.xen-host.de
« Reply #9 on: November 30, 2006, 09:32:09 am »
A solution for a small Wikipedia-specific server would be wikijserver: http://wikijserver.achterliek.de/ (parts in German)

It runs pretty well on my Z with the German "Exzellente Artikel" dump from the site. The only problem is that there is no dump of the complete Wikipedia around, and so far I have found no info on how to generate one. But IMHO this is the best solution for a complete Wikipedia on the Z, provided we get a recent dump...

seiichiro0185

PS: for all who want to try it, the wikijserverl_arm_1_4_12_all.ipk doesn't work with Jeode or PersonalProfile (on Cacko), but the ewe version with the ewe-runtime works without problems.
C1000
Cacko 1.23-lite
Networking: Ambicom WL1100C-CF | Linksys USB200M | Sitecom CN-512 USB-Bluetooth
Storage: 2GB+512MB SD, 4x512MB CF, 6GB Microdrive
My Homepage Infos about my Zaurus setup, Linux and other stuff (WiP)

zi99y

  • Sr. Member
  • ****
  • Posts: 282
« Reply #10 on: December 03, 2006, 03:13:18 pm »
Here is a list of toolkits for converting the data: http://meta.wikimedia.org/wiki/Alternative_parsers

Notably, TomeRaider is listed; it is intended for portable devices, but I know of nothing that can read its files on the Z.

My idea is to translate the XML into portabase format ( http://portabase.sourceforge.net/portabase_xml.html ) although I've no idea how yet.

Does anyone use PortaBase and know whether it would handle a large database?
« Last Edit: December 03, 2006, 03:14:35 pm by zi99y »

Tron

  • Newbie
  • *
  • Posts: 47
« Reply #11 on: June 10, 2007, 10:25:49 am »
Hi all,

Just to get this topic back into discussion: I do think that the way wiki2static works (compress and reformat the downloaded dump, then use Apache with CGI to generate pages and do searches) is the way to go. Installing a complete wiki with MySQL to access the database is probably too much for the Zaurus, and not only in terms of the available storage space.
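
The "compress and reformat, then generate pages and search on request" idea can be sketched in Python like this. It is not wiki2static's actual storage format; the CompressedStore class and the sample articles are invented for illustration, and zlib is simply the compressor that ships with Python.

```python
import zlib


class CompressedStore:
    """Keeps each article zlib-compressed and inflates it only when requested."""

    def __init__(self):
        self._blobs = {}  # title -> compressed bytes

    def add(self, title, text):
        self._blobs[title] = zlib.compress(text.encode("utf-8"), 9)

    def get(self, title):
        blob = self._blobs.get(title)
        return None if blob is None else zlib.decompress(blob).decode("utf-8")

    def search_titles(self, term):
        """Case-insensitive substring search over titles only (no full text)."""
        term = term.lower()
        return sorted(t for t in self._blobs if term in t.lower())

    def compressed_size(self):
        return sum(len(b) for b in self._blobs.values())


# Made-up, highly repetitive sample text, so it compresses unusually well.
samples = {
    "Zaurus": "The Sharp Zaurus is a Linux PDA. " * 50,
    "Zaurus SL-C3100": "A Zaurus model with a built-in hard drive. " * 50,
    "SquashFS": "A compressed read-only file system. " * 50,
}
store = CompressedStore()
for sample_title, sample_text in samples.items():
    store.add(sample_title, sample_text)

raw_size = sum(len(t.encode("utf-8")) for t in samples.values())
print(store.search_titles("zaurus"))
print(raw_size, "bytes raw,", store.compressed_size(), "bytes compressed")
```

A CGI script would only need the get and search_titles operations, so no database server has to run on the Z.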

As I'm not really satisfied with the *bedic variants (neither "q" nor "z"), wouldn't it be great if someone with knowledge of perl had a look at wiki2zaurus? Maybe that someone could get it working again...
YT,
Tron