Tuesday, December 10, 2024

Can an AI agent make you play nice?

Internet veterans are aware of Betteridge's Law of Headlines, according to which the question in the title of the post should be answered with a resounding "No". Well, I'm here to be the rebel! I'll try to convince you that the answer is in fact Yes!

What does "playing nice" even mean? Well, the word "play" implies games, so we are talking about game theory here, and "playing nice" therefore means "to cooperate". I've written several times in this blog about how to understand the ubiquity of cooperation in the biosphere (see here and here), because people often (mistakenly) argue that cooperative behavior cannot evolve (meaning, evolve via Darwinian mechanisms) because evolution is selfish and cooperation is not. Well, that's just not true. Cooperation without communication cannot evolve, but when communication is possible, it's a no-brainer. I will write about this in more detail when the relevant article on that topic comes out (in the meantime, you can read this letter, which foreshadows that theory). Needless to say, we do not know of any cooperation in the biosphere that does not involve some form of communication. 

But "playing nice" also means something specific within society, in particular when it comes to maintaining a public good, or a resource, that is finite. Everyone knows that maintaining such public goods (such as water reserves, forests, Chilean seabass, or, you know, a livable climate) requires a certain amount of discipline from their users: it is possible to abuse the resource (by overuse, overfishing, deforestation, etc.) to make a short-term profit, but at the cost of forgoing the use of the resource for future generations. This dilemma, the tension between short-term profits and long-term deficits, is commonly referred to as the "tragedy of the commons", a term popularized by the ecologist Garrett Hardin. While Hardin popularized the concept, I would be remiss not to note that he was also completely wrong about some aspects of his analysis, as pointed out, for example, by Elinor Ostrom, whose work showed how communities can successfully govern such common-pool resources. For example, Hardin often warned of overpopulation from a frankly xenophobic point of view, even as we now know that Earth is never going to face an overpopulation problem (but rather the opposite).

In evolutionary game theory (EGT) we can study the tragedy of the commons theoretically and in simulations. Specifically, we can construct a game with \(k+1\) players (the "+1" is the "focal player" whose decision to cooperate or defect we would like to influence, and the \(k\) others are "peripheral" players that are in a group with the focal player). All \(k+1\) players can voluntarily "invest" in a common good, and this investment is then amplified by a factor \(r\) (think of the growth of a forest, or a fishery, when it is appropriately maintained using the investment). The resulting "capital" is then distributed equally to all members of the group, including those that did not invest in the public good. Because of this, the rational strategy in this game is not to cooperate, but rather to profit from the public good without having paid a cost. I'm sure you've seen such behavior, haven't you?

Illustration of the Public Goods Game with \(k=4\) (5 players). In this illustration, three of the players contribute one token each, and the total contribution of 3 is amplified to 15 because the synergy factor here is \(r=5\). 
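If you prefer code to prose, here is a minimal sketch of that payoff calculation, using the numbers from the illustration. The notation is my own (in particular, I assume a contribution costs exactly one token); it is meant only to make the incentive structure concrete.

```python
# Minimal sketch of the public goods game payoff (my own notation; a
# contribution is assumed to cost exactly one token).

def pgg_payoffs(contributions, r):
    """Payoff of each player, given a list of 0/1 contributions and the
    synergy factor r."""
    pot = r * sum(contributions)         # total investment, amplified by r
    share = pot / len(contributions)     # split equally among all players
    return [share - c for c in contributions]   # contributors paid their token

# Numbers from the illustration: 5 players, 3 contribute, r = 5
print(pgg_payoffs([1, 1, 1, 0, 0], r=5))   # [2.0, 2.0, 2.0, 3.0, 3.0]
```

The three contributors each end up with 2 tokens while the two free-riders end up with 3: that is the dilemma in a nutshell.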

Mathematically, you can show that if the synergy factor \(r\) is high enough (namely, larger than the group size), then it becomes advantageous to invest even if no one else does. But you can quickly convince yourself that such synergy factors are illusory even in small groups of, say, 5, never mind millions. The question then is: what can we do to make cooperation profitable even when synergies are below the critical value, that is, when \(r<k+1\)? 
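Before going on, here is the back-of-the-envelope version of why the group size is the critical value. Your own token, once invested, is amplified by \(r\) but then split \(k+1\) ways, so the net effect of contributing on your own payoff is
\[
\Delta = \frac{r}{k+1}-1,
\]
which is positive only when \(r>k+1\), no matter what the other players do.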

You might think that punishing defectors would be a strategy to entice players to cooperate, but as we have previously shown, that doesn't really work. (I really should have written a blog post about this work, which has some fascinating analogies to critical phenomena in condensed-matter systems.) What happens when you introduce punishment is that you create a system in which the transitions between cooperation and defection (and vice versa) become metastable. But punishment does not move the barrier to cooperation, that is, it does not reduce the value of the critical \(r\).

So what can be done? In a recent preprint, my colleague Arend Hintze (at Dalarna University in Falun, Sweden, and my steadfast collaborator in all things game theory) and I have looked at what would happen if some of the players in a group are not human at all, but rather AI-controlled agents. When we use that term, you should not think of some sort of Skynet-controlled robot; just imagine, in general, an automaton that makes decisions based on an algorithm. This is not altogether far-fetched: self-driving cars, for example, are either here already (see Waymo) or eternally promised to be just around the corner (by some other company). It is possible to envision dilemmas that occur when human drivers interact with such agents. For example, there are common situations at intersections where courtesy can lead to smoothly flowing traffic, whereas selfish behavior can create traffic bottlenecks. Being courteous could cost the courteous driver (or AI) some time, but will profit everyone in the long run. Can the presence of AI agents influence how a human driver might behave? Can they lower the barrier to cooperation?

You are probably thinking "That must depend on the AI's programming", and you would be right. We looked at three scenarios for how (and by whom) the AI agents' behavior is controlled (a minimal code sketch of all three policies follows the list):

1. Institutionally prescribed cooperation: All AI agents are programmed to always cooperate, no matter what the human players do. Because companies might not want to take that "hit", this would have to be enforced by regulation.

2. Player-controlled probabilistic cooperation: The probability that the AI agent cooperates is controlled by the player themselves. For example, a player might force all the AI agents in their neighborhood to cooperate, as this would allow the player to "rip them off" (the typical temptation to defect). 

3. AI agent mimics player: In this scenario, the AI agent observes the human player and "copies" their behavior. Thus, if the player is being mean, the AI agent is being mean, whereas if the player cooperates, the agent will do so as well. You can see this strategy as a form of "Tit-for-Tat", if you will. 
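To make the three scenarios concrete, here is the promised minimal sketch of the three AI policies in Python. The function names and the 0/1 action encoding are mine, purely for illustration; the paper's implementation may differ in its details.

```python
# Minimal sketches of the three AI-agent policies (names and encoding are
# illustrative, not taken from the paper): 1 = cooperate, 0 = defect.
import random

def ai_action_scenario1():
    """Institutionally prescribed cooperation: always cooperate,
    regardless of what the human players do."""
    return 1

def ai_action_scenario2(p_set_by_player):
    """Player-controlled probabilistic cooperation: cooperate with a
    probability chosen by the human player who controls this agent."""
    return 1 if random.random() < p_set_by_player else 0

def ai_action_scenario3(observed_player_action):
    """Mimicry: copy the action observed from the human player."""
    return observed_player_action
```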

Arend simulated all three scenarios (I did the math; that's kind of our peculiar synergy) with groups of 5 players in a well-mixed population (so that neighborhoods change every generation), with a variable fraction of AI agents \(\rho_A\) within each group (\(0\leq \rho_A\leq1\)).
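To give a sense of what a single round of such a simulation might look like, here is a rough Python sketch for one focal human player in a well-mixed group. Everything here (the parameter names, the payoff convention, the background human cooperation probability) is my own illustrative guess, not the paper's actual procedure.

```python
# A rough sketch of one public-goods round for a focal human player in a
# well-mixed group of k+1 = 5; parameters and numbers are illustrative
# guesses, not the paper's actual code.
import random

def focal_payoff(focal_action, ai_policy, k=4, rho_A=0.5, r=3.0,
                 p_human_coop=0.5):
    """Payoff of a focal human whose k co-players are AI agents with
    probability rho_A and humans (cooperating with prob. p_human_coop)
    otherwise. Actions: 1 = cooperate, 0 = defect."""
    actions = [focal_action]
    for _ in range(k):
        if random.random() < rho_A:
            actions.append(ai_policy(focal_action))          # AI co-player
        else:
            actions.append(1 if random.random() < p_human_coop else 0)
    pot = r * sum(actions)                                   # amplified pot
    return pot / (k + 1) - focal_action                      # share minus own cost

# Scenario 1 (unconditional cooperation): the AI ignores the focal player.
always_cooperate = lambda focal_action: 1
n = 100_000
coop = sum(focal_payoff(1, always_cooperate) for _ in range(n)) / n    # ~1.4
defect = sum(focal_payoff(0, always_cooperate) for _ in range(n)) / n  # ~1.8
print(coop, defect)   # with these numbers, defection still pays at r = 3
```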

Here is what he found:

In scenario 1, the overall level of cooperation increased with increasing \(\rho_A\), but the critical \(r\) was unaffected: the barrier to cooperation was not lowered for the human players. 

In scenario 2, all human players quickly figured out that they should force the AI agents to cooperate unconditionally (dumb cooperators), but this did not lower the barrier to cooperation either; it remained at \(r_c=k+1\), just as theory predicted.

Scenario 3, however, was different. The larger \(\rho_A\), the lower the value of \(r\) at which the population transitioned to cooperation, as we can see in the plot below. 

Fraction of cooperating players \(p_C\) as a function of synergy factor \(r\), in groups of 5, for different values of the AI agent density \(\rho_A\). The rightmost curve has \(\rho_A=0\), which implies a transition at \(r=k+1=5\). The curves to the left are obtained with increasing \(\rho_A\). 

Well, that smells like success! Increase the fraction of mimicking AI agents, and the human agents are coerced to cooperate at much lower synergy factors. In fact, we can rev up the math engine (meaning, my head) to calculate the predicted critical synergy in this scenario (the \(r\) at which the blue curves in the figure above cross 0.5). Math says:

\(r_c=\frac{k+1}{\rho_A k +1}\)                   (1)
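A quick heuristic for where this comes from (my own back-of-the-envelope argument; the paper does the calculation properly): if a fraction \(\rho_A\) of your \(k\) co-players mimic whatever you do, then your decision to contribute puts, on average, \(1+\rho_A k\) tokens into the pot that would otherwise not be there. Your share of those tokens, after amplification, is \(r(1+\rho_A k)/(k+1)\), at a cost of 1, so contributing pays when
\[
\frac{r\,(1+\rho_A k)}{k+1}>1 \quad\Longleftrightarrow\quad r>\frac{k+1}{\rho_A k+1},
\]
which is precisely Equation (1).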

This is a very simple formula, and it came as a bit of a surprise that the dependence on the number of cooperators in the group actually canceled. So, how does this prediction fare against simulations? You can see that in the figure below.

Critical synergy (crossover points in the previous figure) as a function of the agent density \(\rho_A\). The blue-to-green crosses are obtained from simulations (and connected to guide the eye), while theory (Equation (1)) is given by the dashed line.
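Just to put numbers to the dashed theory line, here is a quick evaluation of Equation (1) for \(k=4\) (groups of 5); this is plain arithmetic, not the paper's code.

```python
# Evaluate Equation (1), r_c = (k+1)/(rho_A*k + 1), for groups of 5 (k = 4).
k = 4
for rho_A in [0.0, 0.25, 0.5, 0.75, 1.0]:
    r_c = (k + 1) / (rho_A * k + 1)
    print(f"rho_A = {rho_A:.2f}  ->  r_c = {r_c:.2f}")
# rho_A = 0 recovers the classic barrier r_c = k + 1 = 5,
# while rho_A = 1 drops it all the way to r_c = 1.
```

Note that a density of only 25% mimicking agents already cuts the required synergy in half.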

In conclusion, AI agents that mimic player behavior can in fact influence player behavior (something that did not happen in scenarios 1 and 2). This strategy appears to be very simple, but keep in mind that mimicking human behavior is not trivial, since humans don't carry around a placard that identifies them as a cooperator or a defector (in biology, this does sometimes happen, and the evolutionary dynamics that describes it is known as the "green beard effect"). Thus, an AI agent may have to learn how to interpret a human player's actions and infer whether they are a cooperator (and therefore cooperate with them) or a defector (and thus not cooperate with them). In any case, the agents have to obtain a sufficient amount of information to make that decision, and this can be complicated in real-world situations. 

So what have we learned here? It turns out AI agents can make you play nice! Now, in a better world, we would not need such devices to coerce recalcitrant players. After all, there is a rule, "treat others as you would like to be treated", and this golden rule would certainly give rise to universal cooperation. But our world is not perfect; rather, it is saturated with cheating defectors that are perfectly happy to take advantage of public goods for short-term personal gain. So some of us need to be "shown the way", and it turns out that surrounding them with mimicking AI agents might just do the trick! If only that would work in all of our social interactions. 

Of course, there are limitations to our approach when it comes to translating it to actual human populations interacting with technology; they are described in the paper: 

Arend Hintze and Christoph Adami, Promoting Cooperation in the Public Goods Game using Artificial Intelligent Agents, arXiv:2412.05450

However, we believe the general trends that we have studied would likely carry over, as the dynamics (in the end) are very simple. 















