python - NLTK: saving a chunked parse tree to a file and loading it with a CorpusReader class


Let's say I have the chunked corpus below, saved in a file called test.txt:

[Rapunzel/NNP] let/VBD down/RP [her/PP$ long/JJ golden/JJ hair/NN]

Then I can load it with ChunkedCorpusReader.

>>> from nltk.corpus.reader import ChunkedCorpusReader
>>> reader = ChunkedCorpusReader('.', 'test.txt')
>>> reader.chunked_sents()[0]
Tree('S', [Tree('NP', [('Rapunzel', 'NNP')]), ('let', 'VBD'), ('down', 'RP'), Tree('NP', [('her', 'PP$'), ('long', 'JJ'), ('golden', 'JJ'), ('hair', 'NN')])])
>>> print(reader.chunked_sents()[0])
(S
  (NP Rapunzel/NNP)
  let/VBD
  down/RP
  (NP her/PP$ long/JJ golden/JJ hair/NN))

Then I made a change to the tree object, say, switched the chunk tag from NP to NPP, and called the result new.

>>> print(new)
(S
  (NPP Rapunzel/NNP)
  let/VBD
  down/RP
  (NPP her/PP$ long/JJ golden/JJ hair/NN))
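For illustration, one way such a relabeling could be done is sketched here. It assumes the tree is rebuilt by hand rather than read from test.txt, and the variable names `tree` and `chunk` are only illustrative.

```python
from nltk.tree import Tree

# The chunk tree as ChunkedCorpusReader would return it (rebuilt by hand here).
tree = Tree('S', [
    Tree('NP', [('Rapunzel', 'NNP')]),
    ('let', 'VBD'),
    ('down', 'RP'),
    Tree('NP', [('her', 'PP$'), ('long', 'JJ'), ('golden', 'JJ'), ('hair', 'NN')]),
])

# Work on a deep copy so the original tree keeps its NP labels.
new = tree.copy(deep=True)

# Relabel every NP chunk as NPP.
for chunk in new.subtrees(lambda t: t.label() == 'NP'):
    chunk.set_label('NPP')

print(new)
```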

Now I want to save the new tree to a file and load it with ChunkedCorpusReader or another reader, as I did with test.txt. However, I couldn't find a way to save an NLTK tree object to a file, let alone read that file back. Can anyone help?

The default conversion to a string, which is what print gave you, is not bad: it merges words with their POS tags and indents new lines properly. Since file.write() doesn't automatically convert its argument to a string, you must pass str(new) to the file's write method.
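A minimal sketch of that approach follows. It assumes a throwaway file name, new.txt in a temporary directory. Reading the text back with Tree.fromstring() is one possible option rather than something the question requires, and passing str2tuple as read_leaf is what turns leaves such as her/PP$ back into (word, tag) tuples.

```python
import os
import tempfile

from nltk.tag import str2tuple
from nltk.tree import Tree

new = Tree('S', [
    Tree('NPP', [('Rapunzel', 'NNP')]),
    ('let', 'VBD'),
    ('down', 'RP'),
    Tree('NPP', [('her', 'PP$'), ('long', 'JJ'), ('golden', 'JJ'), ('hair', 'NN')]),
])

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'new.txt')

# write() needs a string, so convert the tree explicitly.
with open(path, 'w', encoding='utf-8') as f:
    f.write(str(new) + '\n')

# Parse the bracketed text back into a Tree with (word, tag) leaves.
with open(path, encoding='utf-8') as f:
    restored = Tree.fromstring(f.read(), read_leaf=str2tuple)

print(restored == new)
```

Without read_leaf, the leaves would come back as plain strings like 'her/PP$' instead of tuples.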

For more control over the appearance of the tree's string representation, use the Tree method pformat(). Note that Tree.pformat() was called Tree.pprint() in earlier versions of NLTK; in recent versions (NLTK 3), Tree.pformat() returns a string while Tree.pprint() writes to stdout.
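As a small illustration of that control, the sketch below uses pformat()'s margin argument to choose between the wrapped layout and a single line. The tree is built with Tree.fromstring() purely for brevity, so its leaves are plain strings here.

```python
from nltk.tree import Tree

flat = '(S (NPP Rapunzel/NNP) let/VBD down/RP (NPP her/PP$ long/JJ golden/JJ hair/NN))'
new = Tree.fromstring(flat)

# Default margin is 70 characters, so this tree is wrapped over several lines.
wrapped = new.pformat()

# A wider margin keeps the whole tree on one line.
one_line = new.pformat(margin=200)

print(wrapped)
print(one_line)
```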

If you want the tree delimited with square brackets, add the option parens="[]" to pformat().

>>> print(new.pformat(parens="[]"))
[S
  [NPP Rapunzel/NNP]
  let/VBD
  down/RP
  [NPP her/PP$ long/JJ golden/JJ hair/NN]]
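Note that this nested, labeled bracketing is not the same layout as test.txt, where each chunk is a flat [word/tag ...] group with no label. As far as I can tell, the default chunk parser used by ChunkedCorpusReader expects that flat layout, one sentence per line. The sketch below is one possible round trip under that assumption. to_chunk_line is a hypothetical helper (not part of NLTK), the file name is made up, and the NPP label is restored by handing the reader a tagstr2tree with a different chunk_label.

```python
import functools
import os
import tempfile

from nltk.chunk.util import tagstr2tree
from nltk.corpus.reader import ChunkedCorpusReader
from nltk.tree import Tree


def to_chunk_line(tree):
    """Hypothetical helper: flatten a one-level chunk tree to '[w/t ...] w/t'."""
    parts = []
    for child in tree:
        if isinstance(child, Tree):
            # A chunk: bracket its word/tag pairs; the label itself is not written.
            parts.append('[' + ' '.join('/'.join(leaf) for leaf in child) + ']')
        else:
            # A token outside any chunk.
            parts.append('/'.join(child))
    return ' '.join(parts)


new = Tree('S', [
    Tree('NPP', [('Rapunzel', 'NNP')]),
    ('let', 'VBD'),
    ('down', 'RP'),
    Tree('NPP', [('her', 'PP$'), ('long', 'JJ'), ('golden', 'JJ'), ('hair', 'NN')]),
])

root = tempfile.mkdtemp()  # hypothetical corpus directory
with open(os.path.join(root, 'new.txt'), 'w', encoding='utf-8') as f:
    f.write(to_chunk_line(new) + '\n')

# Label every bracketed chunk NPP instead of the default NP.
reader = ChunkedCorpusReader(
    root, 'new.txt',
    str2chunktree=functools.partial(tagstr2tree, chunk_label='NPP'),
)
print(reader.chunked_sents()[0])
```

This only works when all chunks share one label; a tree with mixed chunk labels would need a different on-disk format, such as the parenthesized form read back with Tree.fromstring().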
