Equall & Apple’s Revolutionizing Transformers: One Wide Feedforward for Unprecedented Efficiency and Accuracy | Synced
Source: Synced | AI Technology & Industry Review
A collaborative research effort from Equall and Apple examines the role of the feed-forward network (FFN) in Transformers and reports a surprising finding: despite accounting for a significant portion of the model’s parameters, the FFN is highly redundant. The researchers therefore propose sharing a single FFN across both the encoder and decoder, which reduces the parameter count while causing only a modest drop in accuracy.
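To get a feel for the parameter savings involved, the short sketch below counts FFN parameters with and without sharing. It is only an illustration under stated assumptions: the dimensions (`d_model=512`, `d_ff=2048`, six encoder and six decoder layers) are those of the original Transformer base configuration, not figures from the Equall and Apple paper, and the function names are hypothetical.

```python
def ffn_params(d_model: int, d_ff: int) -> int:
    """Parameters in one standard two-layer FFN block.

    Counts W1 (d_model x d_ff), b1 (d_ff), W2 (d_ff x d_model), b2 (d_model).
    """
    return d_model * d_ff + d_ff + d_ff * d_model + d_model


def total_ffn_params(d_model: int, d_ff: int, n_enc: int, n_dec: int,
                     shared: bool) -> int:
    """Total FFN parameters across an encoder-decoder Transformer.

    With shared=True, a single FFN is reused by every encoder and decoder
    layer, so it is counted once. Otherwise each layer owns its own FFN.
    """
    n_ffns = 1 if shared else n_enc + n_dec
    return n_ffns * ffn_params(d_model, d_ff)


# Assumed dimensions (Transformer base), chosen only for illustration.
D_MODEL, D_FF, N_ENC, N_DEC = 512, 2048, 6, 6

separate = total_ffn_params(D_MODEL, D_FF, N_ENC, N_DEC, shared=False)
shared = total_ffn_params(D_MODEL, D_FF, N_ENC, N_DEC, shared=True)

print(f"FFN parameters, one FFN per layer: {separate:,}")
print(f"FFN parameters, one shared FFN:    {shared:,}")
print(f"Parameters saved by sharing:       {separate - shared:,}")
```

Under these assumed dimensions, sharing leaves one FFN's worth of parameters instead of twelve. The count covers FFN weights and biases only; attention, embedding, and normalization parameters are unaffected by this kind of sharing and are left out.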