Intent to participate #29
Comments
OK, so I made this thing that runs through the WordNet ontology, starting from a root noun and going over every hyponym recursively, dumping definitions and names into sentence templates. I started off with animal (after the Celestial Emporium of Beneficial Knowledge), but that does not yield quite enough words, so then I ran it for entity. It might need some formatting into a nice PDF, but I am having issues with pandoc. I'm not feeling done with NaNoGenMo yet, though. (Comment updated after moving things around between repositories.)
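For anyone following along, here is a minimal sketch of the kind of traversal described above. It is not the actual print-wordnet code: it uses a small hand-made hierarchy (`TOY_ONTOLOGY`, hypothetical) instead of WordNet, and the sentence template is made up for illustration. With NLTK one would presumably start from a synset and walk its hyponyms instead.

```python
# Toy stand-in for WordNet: name -> (definition, list of hyponym names).
# Hypothetical data for illustration; with NLTK one would instead start from
# a synset such as wordnet.synset("animal.n.01") and walk its hyponyms.
TOY_ONTOLOGY = {
    "animal": ("a living organism that moves voluntarily", ["bird", "fish"]),
    "bird": ("a warm-blooded egg-laying vertebrate with feathers", ["owl"]),
    "fish": ("a cold-blooded aquatic vertebrate with gills", []),
    "owl": ("a nocturnal bird of prey", []),
}

# Made-up sentence template; indentation shows depth in the hierarchy.
TEMPLATE = "{indent}The {name} is {definition}."


def describe(name, ontology, depth=0):
    """Yield one templated sentence per node, depth-first from `name`."""
    definition, hyponyms = ontology[name]
    yield TEMPLATE.format(indent="  " * depth, name=name, definition=definition)
    for child in hyponyms:
        yield from describe(child, ontology, depth + 1)


sentences = list(describe("animal", TOY_ONTOLOGY))
print("\n".join(sentences))
```

Starting from a broader root simply means a bigger tree to walk, which is why entity yields far more text than animal.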
Interesting: animal stays almost animally all the way down, but entity quickly spreads out. There seems to be a problem with capital vowels:
And any chance to detect mass nouns?
I hadn't noticed that capitalization doesn't work for capital vowels. I used out-of-the-box functions from Pattern and NLTK for everything, with some workarounds for the most obvious issues. I had noticed the mass nouns:
And I figured I could maybe infer whether to use the article from the use of articles in the usage examples for each synset, but that is not reliable enough. Also, some synset definitions lack an initial 'a/an'.
But just adding a/an might improperly catch mass nouns. I'm satisfied with this for now, as I'm more interested in exploring meaning than in getting the details of grammar in order for my NaNoGenMo entry.
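A rough sketch of a case-insensitive a/an choice with an opt-out for mass nouns, in case it is useful. This assumes the capital-vowel problem is about article selection before capitalized words, which is a guess since the example is not shown here. It does not use Pattern or NLTK; the `mass_nouns` set is a hypothetical hand-maintained list, and the first-letter rule is only a heuristic (it gets words like "hour" and "unicorn" wrong).

```python
VOWELS = "aeiou"


def indefinite_article(word):
    """Pick 'a' or 'an' from the first letter only, ignoring case."""
    first = word.strip()[:1].lower()
    return "an" if first and first in VOWELS else "a"


def with_article(phrase, mass_nouns=frozenset()):
    """Prefix a/an unless the phrase is in a caller-supplied mass-noun set."""
    if phrase.lower() in mass_nouns:
        return phrase
    return f"{indefinite_article(phrase)} {phrase}"


print(with_article("Emu"))
print(with_article("water", mass_nouns={"water"}))
```

Lowercasing before the vowel check is the whole fix for capitalized words; detecting mass nouns automatically is the harder part and is left to the caller here.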
Forked the repo, moved the print-wordnet thingy into it, and added some other things I'm working on. In extract-phrases I've hacked together the phrase extraction utility from patent-generator to extract two different kinds of text chunks and make a single huge sentence. It could be longer, but I had some kind of error with the Gutenberg header cleaning that halted the extraction process prematurely. It yields something quite similar in tone to @cpressey's poetic inventory. Extract from the generated novel:
Lastly, in segmented-markov I'm trying to mess with a Markov language model based on Peter Norvig's letter n-gram counts to generate a weighted random string of characters, which then gets pushed through his text segmentation functions, yielding stuff like:
Python hits deep recursion fast when segmenting the text with Norvig's code, so strings longer than about 200 characters give a RuntimeError: maximum recursion depth exceeded. I might generate a 250,000-character string and pass it in 100-character chunks. Could also write a weighted monkey script banging out Cicero with it? Or it could be combined with checkerboard layout, perhaps? Maybe toss out the junk?
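Roughly, the generate-then-chunk idea could look like the sketch below. The bigram counts are made-up toy numbers (`BIGRAM_COUNTS` is hypothetical, not Norvig's data), and `generate` and `chunks` are illustrative helper names rather than anything from segmented-markov.

```python
import random

# Toy bigram counts: previous letter -> {next letter: count}.
# Hypothetical numbers; real counts would come from a letter n-gram table.
BIGRAM_COUNTS = {
    "t": {"h": 50, "o": 20, "e": 15},
    "h": {"e": 60, "a": 25, "t": 5},
    "e": {"t": 30, "a": 20, "h": 10},
    "o": {"t": 40, "h": 10},
    "a": {"t": 45, "e": 5},
}


def generate(length, counts, start="t", rng=random):
    """Generate `length` letters, sampling each from its predecessor's counts."""
    out = [start]
    while len(out) < length:
        followers = counts[out[-1]]
        letters = list(followers)
        weights = [followers[c] for c in letters]
        out.append(rng.choices(letters, weights=weights, k=1)[0])
    return "".join(out[:length])


def chunks(text, size=100):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


rng = random.Random(0)  # seeded so runs are repeatable
text = generate(250, BIGRAM_COUNTS, rng=rng)
pieces = chunks(text, 100)
print([len(p) for p in pieces])
```

One caveat with fixed-size chunks: a chunk boundary can fall in the middle of what the segmenter would otherwise treat as one word, so the output near boundaries may differ from segmenting the whole string at once.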
Interesting stuff. Note that it is possible to increase Python's recursion depth limit, if you think it will help, at your own peril -- a quick web search returned this eye-opening article...
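For reference, a small sketch of raising the limit temporarily with the standard `sys` module. `count_down` is just a hypothetical stand-in for a deeply recursive function, and the numbers are arbitrary. In Python 3 the error is reported as RecursionError, which is a subclass of RuntimeError.

```python
import sys


def count_down(n):
    """Deliberately recursive helper standing in for a deep recursive call."""
    return 0 if n == 0 else 1 + count_down(n - 1)


old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(5000)  # raise the ceiling for this call only
try:
    depth_reached = count_down(3000)
finally:
    sys.setrecursionlimit(old_limit)  # always restore the previous limit

print(depth_reached)
```

Restoring the old limit in `finally` keeps the change local. Setting the limit very high can still crash the interpreter outright, which is the "peril" part.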
You could just convert it to a loop with an explicit stack. It would be a [...]
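As an illustration of getting rid of the recursion, here is a sketch of word segmentation done iteratively. To be clear about what it is: a bottom-up dynamic-programming table rather than a literal explicit stack, and not a patch to Norvig's code. The word counts, the unknown-word penalty, and `MAX_WORD_LEN` are all made-up assumptions.

```python
import math

# Toy unigram counts; hypothetical stand-in for a real word-frequency table.
WORD_COUNTS = {"the": 50, "cat": 10, "sat": 8, "a": 30, "at": 12, "he": 15}
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LEN = 20  # assumed cap on candidate word length


def log_prob(word):
    """Log probability of a word; unknown words get a length-based penalty."""
    if word in WORD_COUNTS:
        return math.log(WORD_COUNTS[word] / TOTAL)
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Most probable split of `text` into words, computed without recursion."""
    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[i]: score of best split of text[:i]
    back = [0] * (n + 1)            # back[i]: start of the last word in that split
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            score = best[start] + log_prob(text[start:end])
            if score > best[end]:
                best[end], back[end] = score, start
    # Walk the back-pointers from the end to recover the words.
    words, end = [], n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


print(segment("thecatsat"))
```

Since nothing here recurses, input length is bounded by time and memory rather than by the recursion limit, so the 200-character ceiling would not apply to this version.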
@enkiv2 Yes, rewriting it with an explicit stack would be proper engineering (but I'm all about the science this year, you see, and I thought that article was a nice bit of, uh, Python science. shudder) Plus there's always that certain faint dishearteningness that comes with making edits to third-party code, no matter how nice the code and/or the license. Should I send these upstream? Should I maintain a fork? Etc. @christiaanw I'd be honoured if you (or anyone) could do something with checkerboard-layout; I have a few more ideas along those lines, but didn't want to do too many "optical" experiments because they seem... slightly out of sync with the rest of NaNoGenMo.
I know NaNoGenMo from scouring GitHub for useful Python code for a project I am (was?) working on. Given that there were some interesting contributions last year, such as In-Dialogue, The-Swallows, and the NovelHarvesterBot, I'm thinking of a hack that takes off from one of these approaches to get something interesting.