Gmail was playing up something rotten yesterday – I'm pretty glad that, at the mo, I don't have to rely on it always being there. However, it IS a beta thingmy, so one shouldn't expect completely smooth operation and availability just now, I suppose!
Had a faculty BBQ here yesterday – great fun (although today my head hurts (again)).
Had some early evening (read: sober) time to talk through some interesting ideas on how (some) clustering algorithms could be combined with JC's ideas on wide-coverage parsing of unrestricted text (and how that might work with the latest cut of the robust-accurate-statistical-parsing system). I missed seeing the parser in action early this month – but apparently it's really fast now. The clustering idea – combined with some basic set-theoretic predicate-calculus manipulation stuff – sounds like it would be worth exploring sometime soon.
MSR's FTP server has some completely cool papers on it, but I'll be damned if I can see a list that cross-references paper titles with filenames … so, until I do find something like that, I'd really like to figure out some way of extracting a summary from PDF, PS and DOC files! As it stands at the moment, I have to download files with names like TR-2004-39.pdf, open them, see what they're about, and then, if I keep them, rename them to something more descriptive. A pain (but a necessary one – as there are some real gems up there).
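For what it's worth, the renaming half of that chore is easy enough to sketch in Python. This is only a rough idea, not a working solution: it assumes the text has already been pulled out of the file somehow (which is the hard part, and isn't shown), it guesses that the first non-blank line is the title, and the function names and the example title are all made up for illustration.

```python
import re
from pathlib import Path


def guess_title(text: str) -> str:
    """Crude guess: treat the first non-blank line of the extracted text as the title."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def descriptive_name(original: str, title: str, max_words: int = 8) -> str:
    """Build a filename that keeps the report number and appends a slug of the title."""
    path = Path(original)
    # Keep only lowercase letters and digits, and cap the number of words.
    words = re.findall(r"[a-z0-9]+", title.lower())[:max_words]
    if not words:
        # Nothing usable in the title: leave the name alone.
        return path.name
    return f"{path.stem}_{'-'.join(words)}{path.suffix}"


# Hypothetical extracted text; the title here is invented, not the real paper's.
sample_text = "\n\n  An Example Paper Title \nSome abstract text follows"
new_name = descriptive_name("TR-2004-39.pdf", guess_title(sample_text))
```

Keeping the original report number at the front means the file can still be matched back to the server, and the renamed files still sort the same way.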
I've got 21 scripts to mark and return tomorrow! Poo!