Delete leaves in a tree with regex (Python)


I have a syntax tree, saved in a text file in "Lisp style": the opening and closing brackets show the relations. I want to delete the leaves. For example, where I have "(det the)", I want it to become "det". I'm not an expert in regex, and I wonder how to handle this behaviour in a more complex structure with nested brackets. Here is an example of a tree (in the file it is on one row; it is indented here for simpler visualization):

(s
  (np i)
  (vp
    (vp (v shot) (np (det an) (n elephant)))
    (pp (p in) (np (det my) (n pajamas)))))

I would like to get:

(s np
  (vp
    (vp v (np det n))
    (pp p (np det n))))

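Something like this?

import re
re.sub(r"\((\w+) (\w+)\)", r"\1", t)

where t is the variable holding the syntax tree. The pattern only matches a bracket pair that contains exactly two words, so the inner nodes of the tree are left untouched.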
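To make the behaviour on nested brackets concrete, here is a minimal sketch that runs the substitution on the example tree from the question. The names `LEAF` and `strip_leaves` are made up for illustration, and the pattern uses `\s+` between the two words (a small variation on the one-space version) so that extra spaces inside a leaf do not break the match.

```python
import re

# Matches a bracket pair holding exactly two words: "(label word)".
# Brackets that contain other brackets do not match, so inner nodes survive.
LEAF = re.compile(r"\((\w+)\s+(\w+)\)")

def strip_leaves(tree: str) -> str:
    """Replace every "(label word)" pair with just "label" (hypothetical helper)."""
    return LEAF.sub(r"\1", tree)

t = ("(s (np i) (vp (vp (v shot) (np (det an) (n elephant))) "
     "(pp (p in) (np (det my) (n pajamas)))))")

print(strip_leaves(t))
# (s np (vp (vp v (np det n)) (pp p (np det n))))
```

A single pass is enough: `re.sub` does not rescan its own output, so a node such as `(np det n)` that is produced by the replacement is not collapsed further. One caveat, assuming your trees may contain tokens like `don't` or `well-known`: `\w+` will not match those, and a character class such as `[^\s()]+` may be a better fit.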

For Unicode support: on Python 3, \w already matches Unicode word characters in str patterns; on Python 2, pass the re.UNICODE flag.
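As a rough illustration of the Unicode point, assuming Python 3 and a made-up French example tree: with a `str` pattern, `\w` matches accented letters by default, while the `re.ASCII` flag restricts it to `[a-zA-Z0-9_]` and leaves non-ASCII leaves in place.

```python
import re

pattern = r"\((\w+) (\w+)\)"
t = "(s (np je) (vp (v été)))"  # hypothetical tree with a non-ASCII leaf

# Default on Python 3: \w is Unicode-aware for str patterns.
unicode_result = re.sub(pattern, r"\1", t)
print(unicode_result)
# (s np (vp v))

# With re.ASCII, "été" is no longer a run of \w characters, so that leaf stays.
ascii_result = re.sub(pattern, r"\1", t, flags=re.ASCII)
print(ascii_result)
# (s np (vp (v été)))
```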

