Great article!
One thing that occurs to me is that even if we had a list of traits or measurable attributes that probabilistically correspond with consciousness, we'd probably still have a Goodhart's Law problem due to the programmability of AI systems. E.g., a SuperAI could just claim, "See... training run #8356 exhibited a lower than 10% chance of consciousness!"
I find this worrying because right now we seem pretty far from even having that list of proxy measures in the first place!