Rewards or Value?

March 18, 2023 Cognitive Insights

Are people driven more by direct rewards or by broader value? Should A.I. be driven by one rather than the other? What about end values?

Zero-sum values

These are nothing but the sum of individual rewards.

For instance, one earns tokens on several timesteps. The eventual zero-sum value is the sum of all tokens earned, nothing more or less. Thus, the focus lies primarily on the tokens, to which the value is secondary. In theories of Reinforcement Learning (one kind of A.I.), the emphasis frequently lies on the direct rewards. Long-term value is seen as the sum of all rewards over the trajectory, and that sum is what gets maximized. This is fine in many cases.
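To make the token example concrete, here is a minimal sketch in Python. The function names and the token numbers are made up for illustration. The discounted variant is the common textbook form in Reinforcement Learning, added here as an assumption for comparison; it is not something this post relies on.

```python
def zero_sum_value(rewards):
    """Value as nothing but the plain sum of the per-timestep rewards."""
    return sum(rewards)


def discounted_return(rewards, gamma=0.9):
    """Textbook discounted return: r0 + gamma*r1 + gamma^2*r2 + ...

    gamma is an illustrative discount factor; with gamma=1.0 this
    reduces to the plain sum above.
    """
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical tokens earned on four timesteps.
tokens = [1, 0, 2, 1]
print(zero_sum_value(tokens))          # plain sum of tokens
print(discounted_return(tokens, 1.0))  # same total when nothing is discounted
```

In both functions, the value is fully determined by the individual rewards. Nothing in the value exists beyond what the tokens already contain.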

But if humans are deeply involved, are zero-sum values also preferable?

Are people driven more by rewards or value?

Homo economicus (the one from many economics textbooks) is reward-driven. At school, this translates into points, grades, prestigious institutions, and diplomas. Later on, the translation is equally apparent.

Nevertheless, homo realisticus is value driven. Of course, nature wants us to survive day after day. The reward is to eat, not be eaten, and survive every day. Be assured nature knows that.

But nature has also made us humans more flexible by allowing us to see broader pictures, learn from the past, and plan ahead. That is called intelligence. This characteristic made us more resilient, more exploratory, and eventually more value-driven than any other creature, whatever these values may be.

This is not about good or bad ― yet.

Therefore, choose your values well.

Simply sticking to maximization of rewards may not be the best option. Actually, it almost certainly leads to a detrimental future.

Of course, the pull of rewards is also still in our nature and well within our capabilities. We can follow nature and meanwhile kill each other, and eventually the planet, especially with the technology of today and the near future. This is still within nature’s calling, but it is not nature’s ideal endgame.

In my view, the best end value is not a zero-sum value but, basically, Compassion: an explicit choice. With the focus on this, concrete rewards are not readily efficient since they continually draw the focus to themselves and to the surface level. Compassionately, one needs to go beyond individual rewards on a long and winding road. The end result is one big reward: Compassion itself.

Human-A.I. value alignment

This is appropriately deemed to be crucial by many.

However…

Transforming humans into more robot-like entities is one way to reach human-A.I. value alignment. I fear that we might be going this way in how we use technology in general, specifically present-day A.I. technology.

Reinforcement Learning plays a crucial role in this regard. Much of the talk in it is about ‘rewards.’ It’s probably better not to use this term at all and to speak of nudging, feedback, or even suggestions if you like. Also, in anything related to human depth, the focus should lie on value, not as a sum of rewards but as an entity in its own right.

Of course, I’m for the second option, and I hope you are too: developing A.I. in a human-compatible way from the start, which means yesterday. Even more, A.I. may help us humans to become more humane.

That is my endeavor, out of which I have written The Journey Towards Compassionate A.I.
