Discussion about this post

Ben K

There's a big shift from training to reality in reward sparsity. I recall reading, a year or two ago, about agents being trained to play Minecraft. The researchers modified blocks to break after fewer punches, because otherwise the agent would punch a block for a bit and then wander away without ever seeing a reward. This seems very telling.
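To make the shaping trick concrete, here's a minimal toy sketch (the environment, probabilities, and reward values are all hypothetical, not from the actual Minecraft work): a fast-breaking block, or partial credit per punch, turns a mostly-zero signal into one the agent sees nearly every episode.

```python
import random

def break_block_episode(punches_to_break, max_steps=50, shaped=False):
    """Toy episode: the agent must punch a block until it breaks.

    Sparse reward: the only signal is +1 when the block breaks.
    Shaped reward: each punch also yields a small partial reward,
    mimicking the trick of making blocks easier to break.
    """
    total_reward = 0.0
    hits = 0
    for _ in range(max_steps):
        # Untrained agent: punches the block only some of the time,
        # otherwise "wanders away" (does something irrelevant).
        if random.random() < 0.3:
            hits += 1
            if shaped:
                total_reward += 0.1  # dense partial credit per punch
            if hits >= punches_to_break:
                total_reward += 1.0  # the true, sparse success reward
                break
    return total_reward

# A block that breaks quickly gives feedback almost every episode;
# a slow-breaking block under sparse reward mostly yields 0.
for n in (2, 20):
    sparse = sum(break_block_episode(n) for _ in range(1000)) / 1000
    dense = sum(break_block_episode(n, shaped=True) for _ in range(1000)) / 1000
    print(f"punches={n}: avg sparse reward={sparse:.3f}, shaped={dense:.3f}")
```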

We have millions of years of 'training data' defining the shape of our brains: an architecture decent enough to become functional when updated with only sparse rewards. And each generational update was itself tested for functionality in a sparse-reward environment.

Given that training environments are made reward-rich precisely so that learning is efficient, it seems you can make an environment arbitrarily hostile just by making it reward-sparse, even if the environment itself is quite simple. So it doesn't quite follow that the artifacts we've created are dumb because they can't handle a codebase a tiny fraction of their weight count in size.
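A back-of-envelope illustration of that hostility, with hypothetical numbers: if success requires k correct actions in a row and random exploration picks the right action with probability 1/n each step, the first learning signal arrives only after roughly n^k episodes, exponential in task depth no matter how simple the environment is.

```python
import random

def episodes_until_first_reward(k, n_actions=4, max_episodes=200_000):
    """Corridor of length k: the agent must pick the single correct
    action k times in a row; any mistake ends the episode with 0.
    Returns how many random-exploration episodes pass before the
    first (and only) reward is ever seen. Purely illustrative.
    """
    for episode in range(1, max_episodes + 1):
        if all(random.randrange(n_actions) == 0 for _ in range(k)):
            return episode  # first nonzero learning signal
    return None  # never saw a reward within the budget

# Expected episodes to first reward grows like n_actions**k.
for k in (3, 6, 8):
    n = episodes_until_first_reward(k)
    print(f"k={k}: first reward after {n if n else '>200000'} episodes")
```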
