Author Topic: Opie-reader Learns Speaking  (Read 20876 times)

malik

« on: February 03, 2005, 11:08:40 am »
this would be a nice application!


malik
borzoi, cacko 1.23 lite, 1gb kingston cf, 512mb toshiba sd.. suse 10.0

adf

« Reply #1 on: February 03, 2005, 12:39:31 pm »
It definitely would.

**3100 Zubuntu Jaunty (working on Cacko dualboot), 16G A-Data internal CF, 4G SD, Ambicom WL-1100C CF, Linksys USB ethernet, Belkin F8T020 BT card, Belkin F8U1500-E IR kbd, mini Targus USB mouse, rechargeable AC/DC powered USB hub, PSP cables and battery extenders.

**6000l  Tetsuized Sharprom, installed on internal flash only 1G sd, 2G cf

malik

« Reply #2 on: February 04, 2005, 03:22:21 am »
hello adf,

there are some binaries for the zaurus for tts and speech synthesis, such as
flite (all in one, but not very useful), mbrola (synthesis only) and freetts (java).
one can easily integrate dictionaries (e.g. ztrans) into the reader. using the clipboard
content and the not-yet-implemented ability to start external applications (hello
tim :-)), opie could easily do this job, couldn't it?!

malik

TimW

« Reply #3 on: February 04, 2005, 07:21:20 am »
opie-reader used to have an interface to flite built in. I did it as part of a project for a blind friend, but because of the large buffer built into the front end of flite it was impossible to get decent control over stopping and starting the reading, or to make sure the bookmarks aligned with where flite had got to in its speech.

This was a long time ago with an older version of flite, so maybe it will work better with the new version. The code is still in there but disabled with a #define, so it should be reasonably easy to get something working again. I suspect a more direct interface with flite may be needed, though (I experimented with lots of different ways of interfacing with flite in case the buffering was caused by the interface, and I can't remember which one I left in).

All I need is some spare time and a copy of the current version of flite, preferably x86 as well as ARM. I'll try to take a look soonish.

malik

« Reply #4 on: February 04, 2005, 07:44:47 am »
hello tim,

the last release is version 1.2, from february 2003. there is a gap of more
than one year back to version 1.1 (december 2001). one can find the corresponding
files here:

http://www.speech.cs.cmu.edu/flite/packed/flite-1.2/

it was only an idea. i know that commercial speech synthesis and recognition
software is getting better and better, and maybe the free projects are too. it would
be a kind of killer application and very useful. i really don't know much about
programming and compiling, but if i can help, i will try.

malik

adf

« Reply #5 on: February 05, 2005, 02:41:01 am »
slightly useless observations:
flite (the all-in-one binary) seems to work fine... though I haven't really asked much of it.
the IBM multimodal stuff seems to recognize speech pretty well... maybe there is some hope along those lines to be found by mining at ibm (since you mentioned recognition)?

caunt

« Reply #6 on: February 28, 2005, 04:58:25 pm »
TimW,

Wouldn't the solution be to feed the freetts engine small known chunks... like paragraphs?  Those ought to be easy enough to parse.
('Course, if you open a text with no paragraphs at all, like some writers create, then you are back to having to wait till the engine finishes what it's chewing on.) And then you also know where to put the bookmark.
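
For illustration, the chunking idea could look something like the sketch below. It is in Python for brevity (opie-reader itself is C++), the function name `paragraphs_with_offsets` is made up for this example, and it assumes paragraphs are separated by blank lines, which will not hold for every document format.

```python
import re


def paragraphs_with_offsets(text):
    """Return a list of (start_offset, paragraph) pairs.

    Assumption: paragraphs are separated by one or more blank lines.
    The offset is the character position of the paragraph in `text`,
    which is the kind of value a bookmark could be set to.
    """
    chunks = []
    pos = 0
    # Each match is a paragraph separator: a newline, optional
    # whitespace, then another newline.
    for sep in re.finditer(r"\n\s*\n", text):
        chunk = text[pos:sep.start()]
        if chunk.strip():
            chunks.append((pos, chunk))
        pos = sep.end()
    # Whatever follows the last separator is the final paragraph.
    tail = text[pos:]
    if tail.strip():
        chunks.append((pos, tail))
    return chunks
```

A text with no blank lines at all comes back as one big chunk, which is exactly the caveat mentioned above.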

padishah_emperor

« Reply #7 on: February 28, 2005, 07:04:16 pm »
Tim, if you could add support for flite that would be excellent. The Yopy has flite running fine, and it would work well if you fed it small chunks of text at a time, as suggested above.

Support flite!!
Left Linux and Linux PDAs... sorry, got boring.  Switched to Mac.

TimW

« Reply #8 on: March 01, 2005, 06:26:37 am »
Quote
TimW,

Wouldn't the solution be to feed the freetts engine small known chunks... like paragraphs?  Those ought to be easy enough to parse.
('Course, if you open a text with no paragraphs at all, like some writers create, then you are back to having to wait till the engine finishes what it's chewing on.) And then you also know where to put the bookmark.

I haven't tried freetts yet but the problem was that flite would buffer up several minutes of speech with no feedback of where it had got to. Whilst it is trivial to send a paragraph at a time it was impossible to know when to send the next paragraph because there was no way of knowing when the previous paragraph had finished. The best I came up with was to have the user tap a key when (s)he was ready for the next paragraph.

When I get some spare time I'll have a look at the current versions of flite and freetts and see whether the same problem exists.

TimW

« Reply #9 on: March 01, 2005, 06:28:17 am »
Quote
Tim, if you could add support for flite that would be excellent. The Yopy has flite running fine, and it would work well if you fed it small chunks of text at a time, as suggested above.

Support flite!!

I'll make sure to support both should I manage to get it working (unless freetts doesn't have the problem but flite still does, in which case would you settle for the tap-a-key-for-more-data solution I described above?).

caunt

« Reply #10 on: March 02, 2005, 12:19:50 pm »
Hi Tim,
I'm the one who emailed you about this and have been looking into it.  FreeTTS would be great, but unfortunately it REQUIRES Java 1.4, which is apparently unavailable on the Z (at least on my 5000d).  That leaves us with flite.
The SPOKEN author managed to get flite working quite well, and I've contacted him.  He hasn't released his source to the public yet (and I don't really want to push him into it), but he did tell me that he ended up feeding the engine one sentence at a time, and actually restarting after each sentence.  The only problem with SPOKEN is that it is no longer in development and only accepts plain text.  Marrying the remarkable capabilities of QTReader, which can obtain plain text from a number of formats, to the flite engine as in SPOKEN would be ideal.
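
As a rough illustration of the one-sentence-at-a-time idea (this is not SPOKEN's code, which has not been released; the function name is invented and Python is used only to keep the sketch short), a naive splitter might be:

```python
import re


def split_sentences(paragraph):
    """Split a paragraph into sentences, naively.

    Assumption: a sentence ends at '.', '!' or '?' followed by
    whitespace. Abbreviations such as "e.g." or "Mr." will be split
    in the wrong place; a real reader would need something smarter.
    """
    # The lookbehind keeps the punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]
```

Each returned sentence could then be handed to the speech engine on its own, so the reader always knows which sentence is currently being spoken.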

TimW

« Reply #11 on: March 02, 2005, 12:41:20 pm »
I only mentioned freetts because someone else did. At least I know why I couldn't find it, now 8^).

Anyway, I think I've found a way to use flite correctly, but I need flite compiled as a shared library for both the desktop and the zaurus. I can't find any pre-compiled versions and my laptop doesn't have enough memory to build it itself. I'll be giving it a go once I can get the kids off the "big computer" 8^).

The other alternative is just to start flite for each paragraph as seems to be done in SPOKEN. That would work but seems a bit inefficient. I may try this out as a quick and dirty way of getting something working.

caunt

« Reply #12 on: March 03, 2005, 01:56:41 am »
Hi Tim,

I don't know about compiling as a shared object, except that the flite web site says it can be done:
http://www.speech.cs.cmu.edu/flite/doc/flite_4.html#SEC4
I do know that there is an ARM binary that runs from the command line and works just fine on my 5000d.  You can find a copy here:
http://cmuflite.org/packed/flite-1.2/flite_arm_bin.tar.gz

I wonder how much of a difference it would make to call the api yourself vs run the executable.  

You said there was trouble starting/stopping and aligning bookmarks.
Can you set bookmarks to a particular line of visual text?  I thought the bookmark was just for the visual screen.   At any rate, I see no code available for "stopping", so it appears our only option is careful feeding of input.

TimW

« Reply #13 on: March 03, 2005, 07:42:13 am »
Quote
I wonder how much of a difference it would make to call the api yourself vs run the executable. 

You said there was trouble starting/stopping and aligning bookmarks.
Can you set bookmarks to a particular line of visual text?  I thought the bookmark was just for the visual screen.   At any rate, I see no code available for "stopping", so it appears our only option is careful feeding of input.

There is a certain amount of initialisation time required each time you start a program, which would only be incurred once if you used flite as a shared library instead. I'm not an expert in these things, but loading the app into memory and (re)locating function entry points (of main, at least) are the obvious things that spring to mind.
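
A quick way to get a feel for that per-launch cost is to time a program that does nothing. The sketch below is only an illustration (Python, with an invented helper name); it measures whatever command it is given, so the numbers for flite on a Zaurus would have to be taken on the device itself.

```python
import subprocess
import time


def average_launch_seconds(command, runs=5):
    """Return the mean wall-clock time to start `command` and wait for it.

    `command` is an argument list such as ["flite", "-t", "a"]; using a
    program that exits immediately approximates pure start-up overhead.
    """
    start = time.perf_counter()
    for _ in range(runs):
        # run() blocks until the child process has exited.
        subprocess.run(command, check=True)
    return (time.perf_counter() - start) / runs
```

Multiplying the result by the number of paragraphs in a book gives a rough upper bound on what the invoke-per-paragraph approach costs compared with loading a shared library once.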

A bookmark is (in essence) applied to the first character that is on the screen. In theory you could apply it to any letter on the screen you want, but the GUI/user interaction is considerably simplified if you don't have to specify exactly where on the screen you want to apply your bookmark. If you change formatting options, text size or whatever and go to the bookmark, then you'll see that the only thing which is really in common is the first displayed character (give or take a few formatting marks or other stuff which is invisible in one view or the other).
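
To make that concrete, here is a toy model (Python, hypothetical names, nothing like the real layout code) in which a bookmark is just a character offset and the page is laid out starting from it. Changing the line width changes every line on the page except the first character.

```python
def page_from(text, bookmark, width, lines_per_page):
    """Lay out one page of fixed-width lines starting at offset `bookmark`.

    Assumption: hard wrapping every `width` characters, which ignores
    word boundaries and formatting marks that a real reader handles.
    """
    page = []
    pos = bookmark
    for _ in range(lines_per_page):
        if pos >= len(text):
            break
        page.append(text[pos:pos + width])
        pos += width
    return page


text = "The quick brown fox jumps over the lazy dog"
bookmark = text.index("brown")             # bookmark = character offset
narrow = page_from(text, bookmark, 8, 2)   # small screen / large font
wide = page_from(text, bookmark, 20, 2)    # large screen / small font
```

Both `narrow` and `wide` start with the "b" of "brown" even though the line breaks differ, which is why the first displayed character is the natural thing to bookmark.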

As for careful feeding of text, there was just no way it was possible given that flite would swallow huge amounts of text in an instant but then take several minutes to say the text out loud with absolutely no feedback to the calling app. Anyway, calling flite for each separate paragraph seems to work quite well. I'll add a separate post below so that people don't have to read all this to notice I have something working.

TimW

« Reply #14 on: March 03, 2005, 07:55:53 am »
I've got something working using the "invoke flite for each paragraph" approach. It works surprisingly well, but I was only just able to run it on my SL5000D. OTOH, I do use very highly compressed documents which require several megabytes to decompress, and I managed to bork my root partition by installing more apps to it than there was space for, so it may be that you won't get memory overload if you install it right. It should work okay on any of the production Zaurii though (I think; I obviously can't guarantee it since I don't have any of them).

I also did some tweaks to reduce the memory usage compared to my first attempt but I'd already borked my zaurus by overfilling the root partition at that point (I usually use OZ so I haven't tried too hard to fix it yet). I checked it out on the desktop so I believe it works okay. The code to do it is actually pretty trivial once you decide that you don't mind the overhead of invoking flite separately for each paragraph.
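
For anyone curious what "invoke flite for each paragraph" amounts to, here is a minimal sketch. It is not the opie-reader code (which is C++); it is Python with made-up names, and it assumes a `flite` binary on the PATH that speaks the string passed with `-t`. Because the call blocks until flite exits, the caller knows when each paragraph has finished and can move the bookmark or stop between paragraphs.

```python
import subprocess


def flite_say(text):
    """Speak `text` by running flite once and waiting for it to exit.

    Assumption: `flite` is on the PATH and accepts `-t TEXT`.
    """
    subprocess.run(["flite", "-t", text], check=True)


def read_aloud(paragraphs, start=0, say=flite_say, keep_going=lambda: True):
    """Speak paragraphs from index `start` onwards, one process each.

    `keep_going` is polled before every paragraph so the user can stop
    the reading. Returns the index of the first paragraph that was NOT
    spoken, which is where a bookmark would go.
    """
    index = start
    while index < len(paragraphs) and keep_going():
        say(paragraphs[index])  # blocks until this paragraph is done
        index += 1
    return index
```

Passing a different `say` function makes it possible to try the loop without flite installed, or to swap in another engine later.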

If anybody has a Sharp-based ROM and wants to try it, e-mail me at timwentford at hotmail dot com.

I'll be fixing my Sharp-based zaurus and retrying it, and then I'll have a go at opie. As the code is so simple, it should be in all new opie-reader releases (though as I don't have a proper compiler for the opie rom, it may be a while before it gets into the opie CVS/feeds).