NLP | Chunk Tree to Text and Chaining Chunk Transformation

  • Last Updated : 26 Feb, 2019

We can convert a tree or subtree back to a sentence or chunk string. To see how, the code below uses the first tree of the treebank_chunk corpus.

Code #1 : Joining the words in the tree with spaces.




# Loading library    
from nltk.corpus import treebank_chunk
  
# tree
tree = treebank_chunk.chunked_sents()[0]
  
print ("Tree : \n", tree)
  
print ("\nTree leaves : \n", tree.leaves())
  
print ("\nSentence from tree : \n", ' '.join(
        [w for w, t in tree.leaves()]))

Output :

Tree : 
 (S
  (NP Pierre/NNP Vinken/NNP)
  ,/,
  (NP 61/CD years/NNS)
  old/JJ
  ,/,
  will/MD
  join/VB
  (NP the/DT board/NN)
  as/IN
  (NP a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD)
  ./.)

Tree leaves : 
 [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), 
 ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'), ('join', 'VB'),
 ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'),
 ('director', 'NN'), ('Nov.', 'NNP'), ('29', 'CD'), ('.', '.')]

Sentence from tree : 
 Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .

In the output above, the punctuation is not right: the commas and the period are treated as separate words, so they are joined with a space in front of them like every other word. The code below fixes this with a regular expression substitution.

Code #2 : chunk_tree_to_sent() function to improve Code 1






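import re
  
# Matches one whitespace character followed by a punctuation mark (, . ; ?)
punct_re = re.compile(r'\s([,\.;\?])')
  
def chunk_tree_to_sent(tree, concat =' '):
    # Join the words of the tree's leaves, ignoring the POS tags
    s = concat.join([w for w, t in tree.leaves()])
    # Drop the whitespace before each punctuation mark, keeping the mark itself
    return re.sub(punct_re, r'\g<1>', s)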
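
The effect of the substitution can be checked without loading the corpus. The following is a minimal sketch that assumes a short, hand-typed list of (word, tag) pairs standing in for tree.leaves(); the list and the variable names joined and fixed are illustrative only.

```python
import re

# One whitespace character followed by one of , . ; ?
punct_re = re.compile(r'\s([,\.;\?])')

# Hand-typed (word, tag) pairs standing in for tree.leaves() (illustrative only)
leaves = [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'),
          ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'),
          ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('.', '.')]

# Plain join: every token, punctuation included, gets a space in front of it
joined = ' '.join(w for w, t in leaves)

# Regex fix: remove the whitespace that precedes each punctuation mark
fixed = punct_re.sub(r'\g<1>', joined)

print(joined)  # Pierre Vinken , 61 years old , will join the board .
print(fixed)   # Pierre Vinken, 61 years old, will join the board.
```

Only whitespace directly before one of the four listed punctuation marks is removed, so other tokens such as quotes or brackets would still need their own handling.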

 
Code #3 : Evaluating chunk_tree_to_sent()




# Loading library    
from nltk.corpus import treebank_chunk
# Assumes Code #2 has been saved in a local file named transforms.py
from transforms import chunk_tree_to_sent
  
# tree
tree = treebank_chunk.chunked_sents()[0]
  
print ("Tree : \n", tree)
  
print ("\nTree leaves : \n", tree.leaves())
  
print ("Tree to sentence : ", chunk_tree_to_sent(tree))

Output :

Tree : 
 (S
  (NP Pierre/NNP Vinken/NNP)
  ,/,
  (NP 61/CD years/NNS)
  old/JJ
  ,/,
  will/MD
  join/VB
  (NP the/DT board/NN)
  as/IN
  (NP a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD)
  ./.)

Tree leaves : 
 [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), 
 ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'), ('join', 'VB'),
 ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'),
 ('director', 'NN'), ('Nov.', 'NNP'), ('29', 'CD'), ('.', '.')]

Tree to sentence :  Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.

Chaining Chunk Transformation
Transformation functions can be chained together to normalize chunks. The resulting chunks are often shorter while still holding the same meaning.

In the code below, a single chunk and an optional list of transform functions are passed to the function. It calls each transform function on the chunk in order, passing the result of one to the next, and returns the final chunk.

Code #4 :




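# The four default transforms are not defined in this article; they are
# assumed to already exist in the same module (transforms.py).
def transform_chunk(
        chunk, chain = [filter_insignificant, 
                        swap_verb_phrase, swap_infinitive_phrase, 
                        singularize_plural_noun], trace = 0):
    # Apply each transform in order, feeding its output to the next one
    for f in chain:
        chunk = f(chunk)
          
        # With trace on, print the chunk after every step
        if trace:
            print (f.__name__, ':', chunk)
              
    return chunk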

 
Code #5 : Evaluating transform_chunk




from transforms import transform_chunk
  
chunk = [('the', 'DT'), ('book', 'NN'), ('of', 'IN'), 
         ('recipes', 'NNS'), ('is', 'VBZ'), ('delicious', 'JJ')]
  
print ("Chunk : \n", chunk)
  
print ("\nTransformed Chunk : \n", transform_chunk(chunk))

Output :

Chunk :  
[('the', 'DT'), ('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS'), 
('is', 'VBZ'), ('delicious', 'JJ')]

Transformed Chunk : 
[('delicious', 'JJ'), ('recipe', 'NN'), ('book', 'NN')]
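
Because the default transforms live outside this article, the chaining mechanism itself can be tried with stand-in functions. The sketch below assumes two hypothetical transforms, drop_determiners and lowercase_words, which are not part of the original transforms module, and repeats the same loop as transform_chunk so that it runs on its own. Setting trace = 1 prints the chunk after each step.

```python
def drop_determiners(chunk):
    """Hypothetical transform: remove every word tagged DT."""
    return [(w, t) for w, t in chunk if t != 'DT']

def lowercase_words(chunk):
    """Hypothetical transform: lowercase each word, keep its tag."""
    return [(w.lower(), t) for w, t in chunk]

def transform_chunk(chunk, chain=(drop_determiners, lowercase_words), trace=0):
    # Same loop as Code #4, with the stand-in transforms as the default chain
    for f in chain:
        chunk = f(chunk)
        if trace:
            print(f.__name__, ':', chunk)
    return chunk

chunk = [('The', 'DT'), ('Book', 'NN'), ('of', 'IN'), ('Recipes', 'NNS')]

# Default chain with tracing: one printed line per transform
result = transform_chunk(chunk, trace=1)

# A custom chain can be passed to run only some of the transforms
lowered_only = transform_chunk(chunk, chain=[lowercase_words])

print(result)        # [('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS')]
print(lowered_only)  # [('the', 'DT'), ('book', 'NN'), ('of', 'IN'), ('recipes', 'NNS')]
```

The order of the chain matters in general, since each transform only sees the output of the one before it.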
