near

@nearcyan · Twitter · 2023-03-14 02:22 UTC

"We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence" - Noam Shazeer (second author of the transformer paper, now CEO of Character AI) from the SwiGLU paper: https://arxiv.org/abs/2002.05202v1