Author Topic: Anyone Got A Working Wiki2bedic.pl  (Read 14732 times)

rafm

  • Full Member
  • ***
  • Posts: 145
« Reply #45 on: April 22, 2005, 08:04:22 am »
Quote
> I'll give it a go today. Hopefully the wireless is up to it - I'll be downloading straight to the Z.
>
> EDIT: Nope. Not a chance. I'll be downloading this one at home.

Just let me know if it works for you.
SL-C1000 w/ Cacko ROM 1.23

tovarish

  • Sr. Member
  • ****
  • Posts: 297
« Reply #46 on: April 22, 2005, 06:45:53 pm »
It worked for me, but a lot of the text had literal "\n"s in it.
It's nice to have it on the Z, though.

tovarish

BarryW

  • Hero Member
  • *****
  • Posts: 690
« Reply #47 on: April 22, 2005, 10:49:14 pm »
Fixed version works great. Quick question though: which set of fonts has all the cool extras, like the pi symbol and stuff like that?
What's this button do??

C3100
Distro changes almost weekly...

C3200
Distro also changes almost weekly...  :)

Hardware hacks and stuff.

ZDevil

  • Hero Member
  • *****
  • Posts: 1998
« Reply #48 on: April 23, 2005, 08:06:18 am »
Thanks for the effort! For me, the Wikipedia dump alone is a killer reason to get a Z.
I got the same issue that others have posted: quite a number of links and pieces of text turn into either \n or \n*. I also found differences between the entries on the website and those in the dump.

Please keep it up!! Looking forward to seeing an improved version!

Life is too precious for hacking *too much*
Visit my Z screencap gallery
My EeePC 701 Black = Debian (Lenny) on IceRocks + Transcend SDHC Class6 8GB + 2GB RAM
My Zaurus SL-C3200 = Debian EABI (kernel 2.6.24.3-yonggun) on a swapped internal Sandisk Extreme III CF 16gb
My Debian EABI feed: http://matrixmen.free.fr/zaurus/debian/
My OpenBSD/Zaurus feeds:  Link1, Link2

rafm

  • Full Member
  • ***
  • Posts: 145
« Reply #49 on: May 06, 2005, 12:01:56 pm »
Quote
> Fixed version works great. Quick question though: which set of fonts has all the cool extras, like the pi symbol and stuff like that?

Math symbols probably won't work if they are shown in Wikipedia as images.

iamasmith

  • Hero Member
  • *****
  • Posts: 1248
« Reply #50 on: May 06, 2005, 12:13:27 pm »
There are quite a few blank articles (try CoventGarden) and still quite a few embedded line feeds that haven't been interpreted in the Wikipedia stuff.

Every time I look at these scripts I think it's an uphill struggle because I don't know Perl well enough... I might have a go at something in C++.
OpenBSD 4.2 -current on full 4Gb of SL-C3000
Microdrive replaced with 4Gb SanDisk Extreme III card

kahm

  • Hero Member
  • *****
  • Posts: 657
« Reply #51 on: May 06, 2005, 01:02:25 pm »
Another blank article is A-10ThunderboltII. It has some on-screen corruption under the title as well.
Fujitsu U8240 "Stormtrooper" -  Zaurus Supplement
Libretto U100 | Sony Librie, Sony Reader
SL-C3100: Sharp 1.11JP (Kanji Dictionary/Translator) - LCD Top swap with C1000.
SL-C3000: pdaXii13 5.4.7, SL-C3000 5.4.9 - microdrive replaced with 8gb Sandisk
SL-C1000: PDAXRom Beta3 | SL-6000L: Sharp 1.12 | SL-5500: Cacko, 64-0 kernel | SL-5000D: OZ-Opie
Linksys WCF12; Sharp CE-AG06, CE-RH2, CE-170TS; iRiver USB OTG Host cable; Socket BT rev.E CF; Hitachi 6gb Microdrive

rafm

  • Full Member
  • ***
  • Posts: 145
« Reply #52 on: May 11, 2005, 10:10:38 am »
Does anyone have a version of wiki2bedic.pl that works on the latest SQL dumps? My version (marked in the comments as 0.9 (7.1.2004)) just goes into an infinite loop when run on de.wikipedia or pl.wikipedia.

It may be a good idea to put wiki2bedic.pl into the CVS of the bedic SourceForge project. I would also add some links from the zbedic home page to that file.

lucho

  • Jr. Member
  • **
  • Posts: 57
« Reply #53 on: May 11, 2005, 10:39:52 am »
I have a version that is (somewhat) working. At least it doesn't go into an infinite loop. The content of the entries is not perfect -- I see '\n', {sa} etc., but I don't have the time (or the free space on my laptop) to fix it.
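
A possible explanation for the stray '\n' text reported in this thread is that backslash escapes from the SQL dump are being copied through without being decoded. Below is a minimal sketch in Python (not Perl, and not taken from wiki2bedic.pl) of how such escapes could be decoded. It assumes the dump uses MySQL-style backslash escapes; the function name unescape_sql_text is made up for illustration.

```python
import re

# MySQL-style backslash escapes. This table is an assumption about the
# dump format, not something taken from wiki2bedic.pl.
_ESCAPES = {
    "n": "\n",
    "r": "\r",
    "t": "\t",
    "0": "\0",
    "\\": "\\",
    "'": "'",
    '"': '"',
}

# A backslash followed by any single character.
_ESCAPE_RE = re.compile(r"\\(.)", re.DOTALL)


def unescape_sql_text(text):
    """Decode backslash escapes in a single left-to-right pass."""
    def replace(match):
        char = match.group(1)
        # Unknown escapes drop the backslash and keep the character
        # (a simplification).
        return _ESCAPES.get(char, char)
    return _ESCAPE_RE.sub(replace, text)
```

The single pass matters: chaining plain string replacements would turn an escaped backslash followed by "n" into a line feed, which is wrong.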

chrisg

  • Newbie
  • *
  • Posts: 10
« Reply #54 on: May 12, 2005, 11:35:49 am »
I am currently working on new dumps for Wikipedia (check out http://www.crispy-cow.de/wikimedia/). I hope Rafal ("rafm") and I can work together on improving things.

BarryW

  • Hero Member
  • *****
  • Posts: 690
« Reply #55 on: May 12, 2005, 12:01:07 pm »
Quote
> Quote
> > Fixed version works great. Quick question though: which set of fonts has all the cool extras, like the pi symbol and stuff like that?
>
> Math symbols probably won't work if they are shown in Wikipedia as images.


They were there with the last version; maybe it's changed.

rafm

  • Full Member
  • ***
  • Posts: 145
« Reply #56 on: May 12, 2005, 12:22:01 pm »
Quote
> I have a version that is (somewhat) working. At least it doesn't go into an infinite loop. The content of the entries is not perfect -- I see '\n', {sa} etc., but I don't have the time (or the free space on my laptop) to fix it.

Could you put your version of the script into the CVS of the bedic SF project?

Thanks.

iamasmith

  • Hero Member
  • *****
  • Posts: 1248
« Reply #57 on: May 12, 2005, 12:24:35 pm »
Quote
> I am currently working on new dumps for Wikipedia (check out http://www.crispy-cow.de/wikimedia/). I hope Rafal ("rafm") and I can work together on improving things.

I have been looking at the quality of some of the dumps in BEDIC format and have seen some of the \n and {} type artifacts, blank articles, etc. I have always felt a little powerless to help, given that I really don't have the Perl skills needed to understand the scripts in depth. I am thinking about producing something in C++ that can do this, with extensible markup-translation parsers, for this project. To start, I have written an extending ring-buffer module that can read articles into memory for processing using the minimum amount of RAM while still accommodating some of the larger articles.

I have tested this ring-buffer technique by letting it read the current dumps, which include archive articles approximately 3Mb in size (giving ~1500 x 512-byte buffers in the ring).
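
For readers wondering what a ring of fixed-size buffers that grows on demand might look like, here is a rough sketch. It is written in Python rather than the C++ mentioned above, it is not the module described in this post, and the class name GrowingBlockRing and its methods are made up for illustration. Only the 512-byte block size comes from the post.

```python
from collections import deque

BLOCK_SIZE = 512  # block size mentioned in the post


class GrowingBlockRing:
    """FIFO byte buffer made of fixed-size blocks (hypothetical sketch).

    New blocks are allocated only when every existing block is full;
    blocks that have been read completely are recycled.
    """

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = deque()        # blocks currently holding data
        self.spare = []              # emptied blocks kept for reuse
        self.read_pos = 0            # read offset in the first block
        self.write_pos = block_size  # write offset in the last block
        self.allocated = 0           # blocks allocated so far

    def _new_block(self):
        # Prefer a recycled block over a fresh allocation.
        if self.spare:
            return self.spare.pop()
        self.allocated += 1
        return bytearray(self.block_size)

    def write(self, data):
        offset = 0
        while offset < len(data):
            if self.write_pos == self.block_size:
                # Last block is full (or none exists yet): extend the ring.
                self.blocks.append(self._new_block())
                self.write_pos = 0
            n = min(len(data) - offset, self.block_size - self.write_pos)
            end = self.write_pos + n
            self.blocks[-1][self.write_pos:end] = data[offset:offset + n]
            self.write_pos = end
            offset += n

    def read(self, size):
        out = bytearray()
        while size > 0 and self.blocks:
            is_last = len(self.blocks) == 1
            limit = self.write_pos if is_last else self.block_size
            n = min(size, limit - self.read_pos)
            if n == 0:
                break  # nothing unread is left
            out += self.blocks[0][self.read_pos:self.read_pos + n]
            self.read_pos += n
            size -= n
            if self.read_pos == self.block_size:
                # First block fully consumed: recycle it.
                self.spare.append(self.blocks.popleft())
                self.read_pos = 0
        return bytes(out)
```

In this sketch the memory in use tracks the largest amount of unread data rather than the size of the whole dump, which is the property the post is after.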

My next step is to work on the markup translation, so I am going to need a complete understanding of the markup used in the Wikipedia articles (this should be fairly easy - I expect it uses the documented Wiki tags directly) and of the ZBedic markup tags.

I know that the libbedic/doc directory describes the database format and the tags used in the markup, but first I wanted to check whether it is up to date, or whether I should be pulling the Zbedic markup renderer apart to find any new tags. Or, if anyone has a more up-to-date list of markup tags, could they share it, please?

- Andy

zuli

  • Newbie
  • *
  • Posts: 1
« Reply #58 on: May 28, 2005, 09:21:58 am »
There are new wikis in German and English made by Christian Geyer, on his homepage: http://www.crispy-cow.de/wikimedia/

Uli

kahm

  • Hero Member
  • *****
  • Posts: 657
« Reply #59 on: May 28, 2005, 03:02:33 pm »
It doesn't look like he's got the English Wikipedia up there yet. Just the German one.