transformers
an archive of posts in this category
| Date | Post |
|---|---|
| Mar 19, 2023 | Measuring Code Generation Abilities of GPT-4 in 10+ Languages |
| Mar 7, 2023 | Unreasonable Effectiveness of LLMs for Code Generation |
| Feb 1, 2023 | Memory IO Efficiency of Multi-Query Attention |
| Nov 17, 2022 | The Illustrated Tensor Parallelism |
| Nov 15, 2022 | The Illustrated Attention via Einstein Summation |