Sep 18 2006, 01:56 AM
Post
#16
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
QUOTE(paka @ Aug 31 2006, 06:38 AM) So I have an offer: if anyone can get the dictionary files from this site (they are for a program that is open source, with code on the site): http://kldp.net/frs/?group_id=73 into zbedic format, I will pay them $100 for their time... just a little incentive to maybe contribute something to the community. I'd do it myself, but my C skills are kind of lacking... Anyone who thinks they would be willing to take on the project, PM me and we can work out the details...
It really isn't that difficult. I have converted several Thai dictionaries to zbedic format and am in the process of converting some more. I write a Perl script to convert from the original format to the zbedic basic format and then use the programs supplied with bedic to turn that into a dictionary file. The Perl script extracts the different parts of each word definition and generates the output. The difficult bit, however, is that you need to know something about the language in order to make a decent-quality conversion. |
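To make the script step described above a little more concrete, here is a minimal sketch. It is written in Python rather than the Perl mentioned in the post, and both layouts are assumptions for illustration only: the input is taken to be one tab-separated line per sense (headword, part of speech, definition), and the output is a placeholder text layout, not the real bedic format, which is defined by the bedic documentation.

```python
from collections import OrderedDict


def parse_line(line):
    """Split one hypothetical source line: headword<TAB>pos<TAB>definition."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        return None  # skip lines that do not match the assumed layout
    headword, pos, definition = (p.strip() for p in parts)
    if not headword or not definition:
        return None
    return headword, pos, definition


def convert(lines):
    """Group definitions by headword and render a placeholder text layout."""
    entries = OrderedDict()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        headword, pos, definition = parsed
        entries.setdefault(headword, []).append((pos, definition))

    blocks = []
    for headword, senses in entries.items():
        body = "; ".join(
            f"({pos}) {text}" if pos else text for pos, text in senses
        )
        # Placeholder layout (assumption): headword line, then one
        # definition line. The real bedic layout may differ.
        blocks.append(f"{headword}\n{body}")
    return "\n\n".join(blocks)
```

The output of `convert` would then be adapted to whatever layout the bedic tools actually expect before being passed to them. Grouping by headword first keeps all senses of a word in one entry, which is where knowledge of the language starts to matter.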
|
|
|
Sep 18 2006, 02:10 AM
Post
#17
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
OK, I had a quick look at the dictionary file and the ldic source. It doesn't look difficult to extract the definitions from the dictionary. However, the original dictionary file looks like it might be in some kind of markup, and I can't quite tell whether the encoding is UTF-8 or something Korean-specific.
I think it would be quite easy to hack together a short program to load the dictionary and spew out bedic simple format, but you'd need to know more about the character encoding in order to complete the conversion. |
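One rough way to narrow down the encoding question is simply to try decoding the raw bytes with a few candidates. The sketch below is only a heuristic, and the candidate list is an assumption: the thread never establishes which encoding the ldic files use, and EUC-KR and CP949 are listed merely as common Korean-specific encodings. UTF-8 is tried first because it is strict enough that non-UTF-8 Korean text will normally fail to decode.

```python
def guess_encoding(raw, candidates=("utf-8", "euc_kr", "cp949")):
    """Return the first candidate codec that decodes `raw` without error.

    This is a heuristic: a successful decode does not prove the guess is
    right, so the decoded text should still be checked by eye.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def decode_dictionary(raw):
    """Decode raw dictionary bytes using the guessed encoding."""
    name = guess_encoding(raw)
    if name is None:
        raise ValueError("none of the candidate encodings matched")
    return raw.decode(name)
```

For example, `guess_encoding(open(path, "rb").read())` could be run on a dictionary file (where `path` is whatever file is being inspected) before writing any parsing code.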
|
|
|
| Guest_ttkman_* |
Sep 18 2006, 04:53 AM
Post
#18
|
|
Guests |
QUOTE(koan @ Sep 18 2006, 02:10 AM) OK, I had a quick look at the dictionary file and the ldic source. It doesn't look difficult to extract the definitions from the dictionary. However, the original dictionary file looks like it might be in some kind of markup, and I can't quite tell whether the encoding is UTF-8 or something Korean-specific. I think it would be quite easy to hack together a short program to load the dictionary and spew out bedic simple format, but you'd need to know more about the character encoding in order to complete the conversion.
Dear koan, as you might have read in the whole thread before, we know how to make that dic. The problem is just to extract it from the ldic source. And that is problematic, first of all because of the lack of C knowledge among those who are interested in this dictionary. So if you could hack something together, do it.
By the way, I made a zbedic dic from that PDF I wrote about... it's not perfect yet, but quite usable... if someone wants it, send me a PM. I also have "other" Korean-English dics; I just got them from a nice Korean guy. But all together they are quite big, and as I don't know whether they are copyright-protected or not, I don't intend to share them officially. So if someone is interested, send me a PM too and we will work out a way.
Greetings, Thomas
PS: I will be in Japan from Wednesday this week, and I don't know when I will be able to get a connection, so please be patient. |
|
|
|
Sep 19 2006, 04:56 AM
Post
#19
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
QUOTE(ttkman @ Sep 18 2006, 04:53 AM) as you might have read in the whole thread before, we know how to make that dic. The problem is just to extract it from the ldic source. And that is problematic, first of all because of the lack of C knowledge among those who are interested in this dictionary. So if you could hack something together, do it.
I managed to compile the ldic program, although I got a bit confused because I don't have a Korean font (I couldn't see any output). Apart from that it looks OK: a very basic GUI. I think extracting the dictionary info should be straightforward, but putting it into a sensible bedic file might be tricky because I only know two phrases in Korean.
koan |
|
|
|
| Guest_ttkman_* |
Sep 19 2006, 05:29 AM
Post
#20
|
|
Guests |
If we are able to extract the whole dic from the ldic program, we have a basic structure, and then we should be able to use a bash or Perl script to convert it... but really, my C skills are so bad... I would be glad if you could perhaps just review the ldic code and change it to send the whole data to stdout. I don't know if you have time to do this, or if your skills are good enough. Perhaps I will try it myself sometime, but right now I am busy learning Korean and Japanese, so I don't really have time to focus on that.
So people, please do something... Thomas |
|
|
|
Sep 20 2006, 12:39 AM
Post
#21
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
QUOTE(ttkman @ Sep 19 2006, 05:29 AM) So ppl, please do something ...
I'll have a go, but it's not top priority for me: I'm converting 3 Thai dictionaries at the moment. By the way, how big is the "small" dictionary already available from the bedic site?
koan |
|
|
|
Sep 30 2006, 12:59 PM
Post
#22
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
Hi guys,
Here is a screenshot of the current status: [screenshot]
Can someone tell me if it's somewhere near correct? This is a quick attempt to parse the file, so it doesn't utilise all the bedic features, hence some strange things like "2. 1." etc.
Thanks, koan |
|
|
|
Oct 2 2006, 10:14 AM
Post
#23
|
|
|
Group: Members Posts: 303 Joined: 6-February 04 Member No.: 1,740 |
ttkman,
Where can I find the dictionaries that you mentioned earlier? |
|
|
|
Oct 16 2006, 02:02 PM
Post
#24
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
Check the screenshot in my previous post; it has been updated.
(Work in progress)
Cheers, koan |
|
|
|
Oct 17 2006, 12:22 PM
Post
#25
|
|
|
Group: Members Posts: 303 Joined: 6-February 04 Member No.: 1,740 |
QUOTE(koan @ Sep 30 2006, 12:59 PM) Hi guys Here is a screenshot from the current status: [screenshot] Can someone tell me if it's somewhere near correct? This is a quick attempt to parse the file so it doesn't utilise all the bedic features, hence some strange things like "2. 1." etc. thanks koan
Well, the characters are Korean for sure and it looks good. However, I can't tell you whether those characters make correct Korean words. |
|
|
|
Oct 19 2006, 03:24 AM
Post
#26
|
|
Group: Members Posts: 11 Joined: 11-March 06 From: Brisbane, Australia Member No.: 9,346 |
QUOTE(koan @ Oct 1 2006, 06:59 AM) Hi guys Here is a screenshot from the current status: [screenshot] Can someone tell me if it's somewhere near correct? This is a quick attempt to parse the file so it doesn't utilise all the bedic features, hence some strange things like "2. 1." etc. thanks koan
Well, my Korean sharemate said that it is correct. |
|
|
|
Oct 24 2006, 10:17 AM
Post
#27
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
QUOTE(coklat @ Oct 19 2006, 03:24 AM)
There are still many issues to fix that are not visible in the screenshot. I am trying to sensibly separate the different sub-senses, parts of speech, categories etc. by developing a set of rules for the script that does the conversion. I am also trying to make the best possible mapping between the original format and the bedic format.
Please understand: I think it is better to do a good job of the conversion than to upload a half-baked mess that gets distributed widely. Do it once, properly, and everyone can use a good-quality dictionary. It may take a little bit of time, but the wait will be worth it.
koan |
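As an illustration of what such conversion rules might look like, here is a small Python sketch that splits a flat definition string into numbered senses and carries a part-of-speech label forward across them. The part-of-speech abbreviations and the "1. ... 2. ..." numbering style are assumptions made for the example; the actual rules and source markup used for this dictionary are not shown in the thread.

```python
import re

# Hypothetical part-of-speech abbreviations; the real source markup may differ.
POS_PATTERN = re.compile(r"^(n|v|adj|adv)\.(?:\s+|$)")
# A sense number such as "1." at the start of the text or after whitespace.
SENSE_SPLIT = re.compile(r"(?<!\S)\d+\.\s+")


def split_senses(definition):
    """Return a list of (pos, text) pairs, one per numbered sense."""
    pieces = [p.strip() for p in SENSE_SPLIT.split(definition) if p.strip()]
    senses = []
    current_pos = None
    for piece in pieces:
        match = POS_PATTERN.match(piece)
        if match:
            # Remember the label so later senses without one inherit it.
            current_pos = match.group(1)
            piece = piece[match.end():].strip()
        if piece:
            senses.append((current_pos, piece))
    return senses
```

Each `(pos, text)` pair could then be mapped onto whatever sense and part-of-speech markup the target format provides. Keeping the rules in small, separately testable patterns like these makes it easier to add exceptions as odd entries turn up.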
|
|
|
Oct 24 2006, 11:54 AM
Post
#28
|
|
|
Group: Members Posts: 303 Joined: 6-February 04 Member No.: 1,740 |
QUOTE(koan @ Oct 24 2006, 10:17 AM) There are still many issues to fix that are not visible in the screenshot. I am trying to sensibly separate the different sub-senses, parts of speech, categories etc. by developing a set of rules for the script that does the conversion. Also, I am trying to make the best conversion between the original format and bedic format. Please understand, I think it is better to do a good job of the conversion rather than upload a half-baked mess that gets distributed widely. Do it once, properly, and everyone can use a good quality dictionary. It may take a little bit of time but the wait will be worth it. koan
That's the best approach. Good luck. Do you think your script could be useful for other attempts to convert other formats into zbedic format? |
|
|
|
Jan 2 2007, 06:34 AM
Post
#29
|
|
|
Group: Members Posts: 318 Joined: 25-February 04 From: UK Member No.: 2,025 |
Hi,
paka and I managed to finish the conversion of these dictionary files. If you are interested in downloading them, please go to my Zaurus Dictionaries Page.
Thanks, koan |
|
|
|