Foundation Models & LLMs
Large language models, training at scale, architecture innovations, benchmarks
The core technology layer — who is building the best models and how fast they improve
In the Foundation Models & LLMs sector, the most critical development right now is the rapid advance of scaling and multimodal capabilities, driven by efficiency gains from key players such as Stanford and Google. With 117 of 240 expert stances supporting continued innovation, models are posting significant performance improvements, as evidenced by recent papers, but this progress is tempered by growing concerns over environmental sustainability and safety risks. The result is a high-activity period, with new benchmarks and experiments pushing the boundaries of AI utility.

The hottest sub-topics include scaling laws for LLMs, where Percy Liang of Stanford argues that scaling compute and data yields efficiency gains, as detailed in his 2024 paper 'Lost in the Middle' (648 citations), despite evidence of diminishing returns. Another key area is AI safety and benchmarking, led by Dario Amodei of Anthropic and Brad Lightcap, who advocate robust evaluations to address hallucinations and bias, as outlined in Qiang Yang's 2024 survey (over 2,000 citations). Multimodal integration, championed by Hugo Larochelle and Bernhard Schölkopf, is also heating up, with Sergey Levine's 2023 PaLM-E paper demonstrating enhanced real-world applications in search and robotics.

The central debate in the sector is whether scaling laws remain the optimal path for LLM advancement. Proponents such as Percy Liang and Jeff Dean of Google hold that scaling continues to drive performance gains and real-world applications, pointing to experiments showing efficiency improvements. Critics such as Bernhard Schölkopf of the Max Planck Institute and Nick Frosst counter that scaling yields diminishing returns at unsustainable environmental cost, citing research that calls for alternative approaches to curb carbon footprints.
For investors, the implications are substantial: there are opportunities in backing companies focused on efficient architectures and safety measures, with the potential for high returns amid rapid innovation. The main risks to watch are regulatory hurdles tied to environmental impact and ethical concerns, which could delay deployments and raise costs. The sector's current momentum creates a narrow window for strategic investment before potential overregulation stifles growth.
Key Voices in Foundation Models & LLMs

Brad Lightcap
OpenAI
5 posts

Trevor Darrell
UC Berkeley
4 posts

Aravind Srinivas
Perplexity AI
4 posts

Casey Newton
Platformer
4 posts

Guillermo Rauch
Vercel
2 posts

Mark Chen
OpenAI
2 posts

Sam Altman
OpenAI
2 posts

Emad Mostaque
Stability AI
2 posts

Tri Dao
FlashAttention
2 posts

Aidan N. Gomez
Cohere
1 post

Dario Amodei
Anthropic
1 post

Alexandr Wang
Scale AI
1 post

we're partnering with @bcg @mckinsey @accenture and @capgemini to deploy openai frontier to enterprises globally https://t.co/5dKA0LViti

Unicorns have always been used to measure sparks of AGI. (This was written by GPT-2 in February, 2019)

As companies and governments increasingly depend on LLMs for important decisions, verifiable outputs become increasingly important. Great demo!

Something folks haven't figured out: 15,000 tokens/second speeds and million-token context windows aren't for humans. They are for the AIs to talk to each other and coordinate faster than we ever could. Not just a bit faster and better: orders of magnitude. That's your competition

The future of design is… engineering. All designers at @vercel now also build, thanks to tools like @v0, Claude Code, and Cursor. They've been contributing to our frontends and apps for a while now. But over the past few months, the leap they've made is engineering the design https://t.co/5un9xjSxoY

This is incredible btw - using Gemini 3.1 as a city builder. I used to dream about this when painstakingly making virtual cities for simulation games like Republic.

Gemini 3 Pro has been upgraded to Gemini 3.1 Pro for all Perplexity Pro and Max users (consumer and enterprise). It's the second most picked model by our Enterprise customers after Claude 4.5 Sonnet/Opus family. Enjoy! https://t.co/E5SH1WxnH5

AI is an amplifier of your intellect and values. A mirror of your soul. If you were a confirmation bias person, AI can be catastrophic for you. There’s some way to contort almost any prompt to give you the answer you’re looking for. The extreme version of this is AI psychosis.

Sonnet 4.6 for all Perplexity Pro and Max customers available now (consumer and enterprise), across all clients - web, mobile, Comet

Happy for my brother. An absolute triumph for Benchmark.

New record for GPT 5.2 Pro ⏲️ Wonder when this will be days 🤔 https://t.co/scuvbDEDrr

New family of Aya models that are small and very effective at key geographies!

Here's an interesting visual reasoning benchmark at which 3-year olds apparently handily beat all frontier models. https://t.co/vDyAlW2BKQ https://t.co/eXfW6bRMtd

Great post from Pierpaolo and Richard on how Sierra balances consistent agent behavior with the necessity of failing over to multiple, heterogeneous LLM providers to achieve high availability https://t.co/Ox0LDTDeBs

This is definitely something to be aware of both for benchmark builders and users IMO. For longer-running, more difficult tasks, the differences between which agent you use can be big, like a 10% gain in success rate when going from Claude Code to OpenHands.

Making progress in Quantum Field Theory with GPT-5.2. It's happening, for real.

$3M to support the development of open benchmarks!

We updated GPT-5.2 (the instant model) in ChatGPT today. Not a huge change, but hopefully you find it a little better.

We fixed search over your history (past threads) on Perplexity. Works really well now. https://t.co/fsDwXcBCz7

A truly generative meta-model of activations, for steering, probing, and understanding LLMs at scale!

We've upgraded Perplexity's Advanced Deep Research harness to run with Opus 4.6 (from last week's version with Opus 4.5). This furthers our lead on Google's DSQA benchmark over other alternatives. Rolled out to all Max users immediately, and slowly rolling to all Pro users. https://t.co/8wmfBxkwSP

we wrote about our in-house data agent used by ~4k people, from product/eng to research, GTM, finance, and more. it was built with codex, and runs on codex, gpt-5, and our evals & embeddings APIs. like codex, it works like a teammate you can collab with https://t.co/sjPGis8CHk

Wonderful collaboration with @francesarnold! We employed genSLM, the first genome-scale language model, to design functional and versatile enzymes.

I can’t wait for tonight’s rubber match to the Bears-Packers trilogy this season. Both of the regular season games were fantastic (the first settled on a late interception of Caleb Williams, and the second in OT on a Caleb bomb to DJ Moore). Caleb Williams' first playoff game, https://t.co/9tLLmrG6Uf

introducing openai for healthcare. it includes chatgpt for healthcare, as well as models optimized for care providers and workflows. both our APIs and chatgpt support HIPAA compliance requirements. we're partnering with HCA, boston children's hospital, MSK, stanford health and

I've decided to release a minimal, free online version of my upcoming "10-202 - Intro to Modern AI" course, starting January 26: https://t.co/ptnrNmVPyf. As a brief summary, this course introduces students to the elements of modern AI systems: you'll build and train a simple LLM

a master class on the physics of language models by FAIR's @ZeyuanAllenZhu

Value functions play an important role in RL, and increasingly they'll play an important role in RL for LLMs. This new paper led by @rohin_manvi is one step in this direction: using value functions to optimize test-time compute with adaptive computation.

