
NLP | Distributed Tagging with Execnet – Part 2

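The gateway’s remote_exec() method takes a single argument that can be one of the following three types: a string of code to execute remotely, a pure function that is serialized and executed remotely, or a pure module whose source is executed remotely.

Code : The remote_tag.py module, which uses the third option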




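import pickle

# execnet runs this module with __name__ set to '__channelexec__'
# and provides the global 'channel' object.
if __name__ == '__channelexec__':
    # The first message is the pickled tagger.
    tagger = pickle.loads(channel.receive())

    # Every later message is a sentence; send back its tagged form.
    for sentence in channel:
        channel.send(tagger.tag(sentence))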

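What is Pure Module?

A pure module is a self-contained module: it can use only the Python modules available where it executes, and it has no access to variables or state from the process that created the gateway. A pure function is restricted in the same way. A module can detect that it is being executed by execnet by checking whether __name__ == '__channelexec__', as remote_tag.py does above.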



Execnet can do a lot more, such as opening multiple channels to increase parallel processing, as well as opening gateways to remote hosts over SSH to do distributed processing.
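For the SSH case, execnet identifies each gateway with a spec string such as ssh=host//python=python3. The following is a minimal sketch of building one spec per host; the helper name ssh_gateway_specs and the host names are hypothetical, and the call to execnet.makegateway() is left as a comment because it needs real hosts.

```python
def ssh_gateway_specs(hosts, python="python3"):
    """Build one execnet gateway spec string per host (hypothetical helper)."""
    return [f"ssh={host}//python={python}" for host in hosts]


# Placeholder host names, for illustration only.
specs = ssh_gateway_specs(["tagger1.example.com", "tagger2.example.com"])

for spec in specs:
    print(spec)
    # With real hosts, each spec would be opened with:
    #     gw = execnet.makegateway(spec)
```

Each remote host would need SSH access without a password prompt, a Python interpreter, and NLTK installed so that the pickled tagger can be unpickled there.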

Creating multiple channels
To make the processing more parallel, multiple gateways are created, each with its own channel. Each gateway starts a new subprocess (or a remote interpreter when an SSH gateway is used), and one channel per gateway handles the communication. Once the two channels exist, they can be combined using the MultiChannel class, which lets the user iterate over the channels and create a receive queue that collects messages from all of them.
After each channel has been created and sent the tagger, the channels are cycled through so that each one receives an equal number of sentences to tag. Then, all the responses are collected from the queue. A call to queue.get() returns a 2-tuple of (channel, message), which identifies the channel a message came from. Responses arrive in the order they are completed, which may differ from the order in which the sentences were sent. Once all the tagged sentences have been collected, the gateways can be shut down by calling exit().



Code :




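import itertools

import execnet
from nltk.corpus import treebank

import remote_tag

# pickled_tagger is assumed to be the pickled tagger created in Part 1.

# One gateway (and therefore one subprocess) per channel.
gw1 = execnet.makegateway()
gw2 = execnet.makegateway()

# Start remote_tag on each gateway and send it the tagger first.
ch1 = gw1.remote_exec(remote_tag)
ch1.send(pickled_tagger)
ch2 = gw2.remote_exec(remote_tag)
ch2.send(pickled_tagger)

# Combine the channels and create a single queue for all replies.
mch = execnet.MultiChannel([ch1, ch2])
queue = mch.make_receive_queue()
channels = itertools.cycle(mch)

# Alternate between the channels so each gets two sentences.
for sentence in treebank.sents()[:4]:
    channel = next(channels)
    channel.send(sentence)

tagged_sentences = []

# Collect the four replies in the order they arrive.
for i in range(4):
    channel, tagged_sentence = queue.get()
    tagged_sentences.append(tagged_sentence)

print("Length :", len(tagged_sentences))

gw1.exit()
gw2.exit()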

Output :

Length : 4

In the example code, only four sentences are sent, but in practice one would send thousands. A single computer can tag four sentences very quickly, but when thousands or hundreds of thousands of sentences need to be tagged, distributing them across multiple computers can be much faster than waiting for a single computer to do it all.
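With many sentences in flight, replies come back in completion order, so the collected list may not line up with the input. One common way to handle this is to send an index with every sentence and use it to place each reply. The sketch below shows the idea with a local simulation that uses only the standard library: threads and queue.Queue stand in for execnet gateways and the receive queue, and toy_tag is a hypothetical stand-in for tagger.tag(). It is not execnet code; with execnet, remote_tag.py would have to be changed to echo the index back.

```python
import itertools
import queue
import threading


def toy_tag(sentence):
    # Hypothetical stand-in for tagger.tag(): label every word 'NN'.
    return [(word, 'NN') for word in sentence]


def worker(name, inbox, outbox):
    # Plays the role of remote_tag.py: handle items until None arrives.
    for index, sentence in iter(inbox.get, None):
        outbox.put((name, (index, toy_tag(sentence))))


def tag_round_robin(sentences, n_workers=2):
    outbox = queue.Queue()  # stands in for the MultiChannel receive queue
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(f'ch{i + 1}', inbox, outbox))
        for i, inbox in enumerate(inboxes)
    ]
    for thread in threads:
        thread.start()

    # Cycle through the workers so each gets an equal share.
    for item, inbox in zip(enumerate(sentences), itertools.cycle(inboxes)):
        inbox.put(item)

    # Replies arrive in any order; the index restores the input order.
    results = [None] * len(sentences)
    sources = {}
    for _ in range(len(sentences)):
        name, (index, tagged) = outbox.get()
        results[index] = tagged
        sources[index] = name

    # Tell every worker to stop, then wait for the threads to finish.
    for inbox in inboxes:
        inbox.put(None)
    for thread in threads:
        thread.join()
    return results, sources


sentences = [['a', 'b'], ['c'], ['d', 'e', 'f'], ['g']]
results, sources = tag_round_robin(sentences)
print("Length :", len(results))
```

Here sources records which simulated channel handled each sentence, mirroring the channel element of the 2-tuple returned by queue.get().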

