Google & IDSIA’s Block-Recurrent Transformer Dramatically Outperforms Transformers Over Very Long Sequences | Synced


Source: Synced | AI Technology & Industry Review

A team from Google Research and the Swiss AI Lab IDSIA proposes the Block-Recurrent Transformer, a novel long-sequence processing approach that matches a conventional transformer layer in computation time and parameter count while achieving significant perplexity improvements on language modelling tasks over very long sequences.
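The core idea is to apply a transformer layer recurrently over blocks of tokens, carrying a small set of state vectors between blocks so that information can flow across a sequence far longer than any single attention window. The sketch below illustrates that block-wise recurrence in numpy; the function names, shapes, and the simplified state update are illustrative assumptions, not the paper's exact architecture (which uses learned gating and trained projections).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    # Scaled dot-product attention (single head, no learned projections).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def block_recurrent_layer(tokens, state, block_size):
    """Process a long sequence block by block, carrying a small
    recurrent state between blocks (simplified sketch of the idea)."""
    outputs = []
    for start in range(0, len(tokens), block_size):
        block = tokens[start:start + block_size]
        # Each block attends both to itself and to the carried state,
        # so per-block cost stays fixed regardless of sequence length.
        context = np.concatenate([state, block], axis=0)
        outputs.append(attend(block, context, context))
        # The state attends to the block to summarize it for the next
        # step (the paper uses LSTM-style gates here; this is simplified).
        state = attend(state, block, block)
    return np.concatenate(outputs, axis=0), state

rng = np.random.default_rng(0)
seq = rng.normal(size=(32, 8))    # a "long" sequence: 32 tokens, dim 8
state = rng.normal(size=(4, 8))   # 4 recurrent state vectors
out, new_state = block_recurrent_layer(seq, state, block_size=8)
print(out.shape, new_state.shape)
```

Because attention is computed only within a block plus the fixed-size state, the cost per block is constant, which is how the approach keeps the same computational budget as a standard transformer layer while still propagating information along arbitrarily long sequences.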