For alignment, we should simultaneously use multiple theories of cognition and value

This post is a follow-up to "A multi-disciplinary view on AI safety research". I elaborate on some of the arguments behind this view.

Computationally tractable mathematical models of alignment are bound to be biased and blind to certain aspects of human values