transformers
an archive of posts in this category
Date | Post |
---|---|
Mar 19, 2023 | Measuring Code Generation Abilities of GPT-4 in 10+ Languages |
Mar 7, 2023 | Unreasonable Effectiveness of LLMs for Code Generation |
Feb 1, 2023 | Memory IO Efficiency of Multi-Query Attention |
Nov 17, 2022 | The Illustrated Tensor Parallelism |
Nov 15, 2022 | The Illustrated Attention via Einstein Summation |