TASM Notes 011

Fri Mar 15, 2024

Bit late on this one; it's actually the meeting notes for last week.

Pre-Meeting Chatting

Zvi's Update

The Talk - Mesaoptimizers and Robustness

Term Check

Interest: "Robustness Guarantee"

The Paper: Risks from Learned Optimization

Goal Misgeneralization

Concrete example:


What's the difference between Robustness and Alignment?

Mesaoptimizers/inner misalignment in the wild

Pub Time

Presumably everyone discussed the items above, but I didn't end up joining this time.

