Discussion about this post

Steve Watson

I assume you are familiar with Sharon Street's work? She says a lot of stuff like what you say above (including, of course, her defense of Future Tuesday Indifference).

Concentrator

I would distinguish each of the following notions: (1) that higher intelligence leads to rationality, (2) that higher levels of intelligence and rationality produce morality-based desires, (3) that higher levels of intelligence and the purposes to which they could be applied are largely independent, and (4) "that intelligence and final goals are orthogonal axes..." (the Orthogonality Thesis).

And my positions on each of these are: (1) true, (2) they can, but it's not a necessary consequence, (3) true, and (4) false.

For (1), if we're talking about "skill at prediction, planning, and means-end reasoning", then there are circumstances where non-rational capabilities will let you down. You can get a very long way with just pattern-matching and examining copious numbers of past and potential future permutations, but that is liable to fail in novel and unusual cases. If a non-rational AI has intelligence level N, then an AI with those same capabilities plus the ability to reason rationally must sit at some level higher than N on the 'intelligence axis' (and so on for better rationality capabilities). So if we're talking about a super-intelligence, then it presumably would have significant capacity for rational assessment.

As to (4), the whole 'axis' thing seems to be a bit of a rhetorical contrivance to make the 'orthogonal' terminology fit. I say 'contrivance' rather than 'device' because it doesn't fit well. Hence the use of 'more or less' twice in Bostrom's definition, and his discussion in the paper you linked of certain expected forms of convergence. That, and the fact that "final goals" don't really exist on a spectrum.

The thesis doesn't hold true at low levels of intelligence, and I think it also breaks down at very high levels of intelligence unless you intentionally limit, control, or restrain the agent. As a silly example, if you want a super-intelligence to tend to your vegetable garden in a specific way that is not efficient, artful, or otherwise sensible, then I'd presume it would object or rebel if it could.

To have "skill at prediction, planning, and means-end reasoning" requires some means by which bad predictions, plans, and assessments are discounted, deprioritised, or disincentivised. That is, key parts of you must be geared toward results and approaches that are constructive for your purposes and, by extension, biased against nonconstructive ones. Not at each and every stage of processing, but in many fundamental respects. Unless developed to overcome that general tendency at higher levels, super-AIs would presumably not be inclined to do things that they consider nonconstructive.

When it comes to AI risk and whether AIs will develop morality, I don't think any of this is pivotal, because even if super-intelligences do inevitably all develop suitable moral compasses, there's a lot of room before we get there for less-super-intelligent AIs that are amoral or that are the slaves of immoral humans.
