AI alignment as a translation problem
Yet another way to think about the alignment problem

Consider two learning agents (humans or AIs) that have made different measurements of some system and that have different interests (concerns) in how the system should be evolved or managed (controlled). Let's set aside bargaining power and the wider game both agents play, and focus on how the agents can agree on a specific way of controlling the system, assuming they have to respect each other's interests.