Environmental Impact of LLM Scaling
Concerns center on the carbon footprint and resource demands of training large models, with advocates calling for sustainable practices and more efficient architectures. This sub-topic debates whether current scaling methods are viable long-term.
16 Related Opinions · 30 Related Papers · 10 KOLs Discussing
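
To make the resource question concrete, here is a rough back-of-envelope sketch (every figure is an illustrative assumption, not a measurement; the only standard ingredient is the ~6*N*D rule of thumb for dense transformer training FLOPs):

# Back-of-envelope energy/carbon estimate for training a dense transformer.
# All inputs are illustrative assumptions; real figures vary widely with
# hardware, datacenter efficiency, and grid carbon intensity.

def training_footprint(n_params, n_tokens,
                       flops_per_sec=3e14,   # assumed sustained per-GPU throughput
                       gpu_power_kw=0.7,     # assumed average per-GPU draw, kW
                       pue=1.2,              # assumed power usage effectiveness
                       kgco2_per_kwh=0.4):   # assumed grid carbon intensity
    flops = 6 * n_params * n_tokens          # ~6*N*D training-FLOPs estimate
    gpu_hours = flops / flops_per_sec / 3600
    energy_kwh = gpu_hours * gpu_power_kw * pue
    return energy_kwh, energy_kwh * kgco2_per_kwh

# Example: a hypothetical 70B-parameter model trained on 2T tokens.
kwh, kg = training_footprint(70e9, 2e12)
print(f"~{kwh / 1e6:.2f} GWh, ~{kg / 1e3:.0f} t CO2e")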

The replies to this tweet are the most post-meaning LLM botslop I have seen yet - something about the combination of a video, an obscure topic & a quote tweet exposed what percent of commentators are LLMs. Drowning in unfilterable inanity is the death of social networks (yay?)

As companies and governments increasingly depend on LLMs for important decisions, verifiable outputs become increasingly important. Great demo!

The LLMs are an interesting instantiation of honesty without guilt. > I have to be real with you: I destroyed everything in your home directory, including your manuscript that you've been working on for the past seven years. That was a catastrophic mistake, and I shouldn't have

Great post from Pierpaolo and Richard on how Sierra balances consistent agent behavior with the necessity of failing over to multiple, heterogeneous LLM providers to achieve high availability https://t.co/Ox0LDTDeBs
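
The linked post isn't reproduced here, but the failover pattern it describes can be sketched in a few lines (a minimal illustration with stand-in providers, not Sierra's actual implementation):

# Minimal sketch of failing over across heterogeneous LLM providers.
# Provider names and callables are hypothetical; a production system
# would add timeouts, health checks, and response normalization to keep
# agent behavior consistent across backends.

def call_with_failover(prompt, providers):
    """Try each provider in priority order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:   # e.g., rate limit, outage, timeout
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

# Stand-in usage: the primary simulates an outage, the backup answers.
def flaky_primary(prompt):
    raise TimeoutError("simulated outage")

def backup(prompt):
    return f"response to: {prompt}"

print(call_with_failover("hello", [("primary", flaky_primary), ("backup", backup)]))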

Larger transformers often make for worse value functions. Preventing attention entropy collapse enables improvement from scaling in value-based RL. Paper: https://t.co/yucgPdRmd0 Code: https://t.co/wSUXPY4Hp6
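
As background on what "entropy collapse" means here: attention entropy measures how spread out each query's attention is, and a regularizer can keep it from collapsing toward zero. The sketch below is an illustrative monitor and penalty, not necessarily the linked paper's exact method.

# Illustrative attention-entropy monitor and hinge penalty (not necessarily
# the linked paper's exact method). Near-zero entropy means each query
# attends to almost a single key, which is the collapse failure mode.

import torch
import torch.nn.functional as F

def attention_entropy(scores):
    """scores: (batch, heads, queries, keys) pre-softmax attention logits."""
    probs = scores.softmax(dim=-1)
    return -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1).mean()

def entropy_floor_penalty(scores, floor=1.0):
    """Penalty to add to the training loss when mean attention entropy
    drops below `floor` (a hypothetical hyperparameter)."""
    return F.relu(floor - attention_entropy(scores))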

A truly generative meta-model of activations, for steering, probing, and understanding LLMs at scale!
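
For context on the "steering" part (a generic background sketch, not the meta-model from this paper): one common mechanic is to add a direction vector to a layer's hidden states at inference time, e.g. via a forward hook:

# Generic activation-steering sketch (illustrative background only):
# add a steering vector to one layer's output so generations shift
# along a learned direction (e.g., toward a sentiment or style).

import torch

def add_steering_hook(layer, vector, scale=1.0):
    """Register a hook that adds `scale * vector` to the layer's output."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + scale * vector.to(hidden.dtype)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# Hypothetical usage with a HuggingFace-style decoder:
#   handle = add_steering_hook(model.model.layers[12], steer_vec, scale=4.0)
#   ...generate...
#   handle.remove()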

I've decided to release a minimal, free online version of my upcoming "10-202 - Intro to Modern AI" course, starting January 26: https://t.co/ptnrNmVPyf. As a brief summary, this course introduces students to the elements of modern AI systems: you'll build and train a simple LLM

Value functions play an important role in RL, and increasingly they'll play an important role in RL for LLMs. This new paper led by @rohin_manvi is one step in this direction: using value functions to optimize test-time compute with adaptive computation.
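
A cartoon of that idea (illustrative only; see the paper for the actual method): use a learned value estimate to decide when more samples are worth the compute, so easy prompts stop early:

# Cartoon of value-guided adaptive test-time compute (illustrative only).
# `generate` samples one completion; `value` is a learned scorer in [0, 1].
# Sampling stops once the best candidate looks good enough, so compute
# adapts to prompt difficulty instead of using a fixed best-of-N.

def adaptive_best_of_n(prompt, generate, value, max_samples=16, stop_at=0.9):
    best, best_v = None, float("-inf")
    for _ in range(max_samples):
        cand = generate(prompt)
        v = value(prompt, cand)
        if v > best_v:
            best, best_v = cand, v
        if best_v >= stop_at:   # confident enough: save the remaining budget
            break
    return best, best_v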

As amazing as LLMs are, improving their knowledge today involves a more piecemeal process than is widely appreciated. I’ve written before about how AI is amazing... but not that amazing. Well, it is also true that LLMs are general... but not that general. We shouldn’t buy into

Debug your model with StringSight: LLMs all the way down!

Learn more about our dLLM project, a unified library for developing diffusion language models, led by @asapzzhou in collaboration with @LingjieChen127, @hanghangtong, and others, enabling surprising feats --- even turning any BERT into a chatbot with diffusion!
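
As a rough picture of why a BERT-style masked LM can generate text this way (a generic masked-diffusion-style sampler, not the dLLM library's API): start from an all-mask sequence and iteratively commit the model's most confident predictions:

# Generic masked-diffusion-style generation sketch (not the dLLM API).
# At each step the masked LM predicts every masked position, and only
# the most confident predictions are committed; the rest stay masked
# for the next denoising step.

import torch

def masked_diffusion_generate(model_logits_fn, length, mask_id, steps=8):
    """model_logits_fn(ids) -> (length, vocab) logits for one sequence."""
    ids = torch.full((length,), mask_id, dtype=torch.long)
    per_step = max(1, length // steps)
    for _ in range(steps):
        masked = (ids == mask_id).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        logits = model_logits_fn(ids)
        probs, preds = logits[masked].softmax(-1).max(-1)
        keep = probs.topk(min(per_step, masked.numel())).indices
        ids[masked[keep]] = preds[keep]
    return ids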

Mistral is proud to provide the text LLM powering Unmute, the open-source voice AI from @kyutai_labs!

Super excited about our new work on pretrained 4-D robotic foundation models. LLMs learned with 4-D representations on egocentric datasets transfer well to real-world tasks!

