[Paper]
State-of-the-art attention based models, mostly centered around the transformer architecture, solve the problem of sequence-to-sequence translation using the so-called scaled dot-product attention. While this technique is highly effective for estimating inter-token attention, it does not answer the question of inter-sequence attention when we deal with conversation-like scenarios. We propose an extension, HumBERT, that attempts to perform continuous contextual argument generation using locally trained transformers.