Monday, February 22, 2016

Quadrant Model of Reality Book 14 Game Theory

QMRCrosstrack, the "unique track switching game", is an abstract strategy game created by Shoptaugh Games in 1994. Players place special track pieces onto an irregular octagon board, winning by being the first to create an unbroken path between two opposite sides.

It is shaped as a quadrant grid

QMRIn graph theory, a haven is a certain type of function on sets of vertices in an undirected graph. If a haven exists, it can be used by an evader to win a pursuit-evasion game on the graph, by consulting the function at each step of the game to determine a safe set of vertices to move into. Havens were first introduced by Seymour & Thomas (1993) as a tool for characterizing the treewidth of graphs.[1] Their other applications include proving the existence of small separators on minor-closed families of graphs,[2] and characterizing the ends and clique minors of infinite graphs.[3][4]

It is the shape of quadrants

QMRPurification theorem
In game theory, the purification theorem was contributed by Nobel laureate John Harsanyi in 1973.[1] The theorem aims to justify a puzzling aspect of mixed strategy Nash equilibria: that each player is wholly indifferent amongst each of the actions he puts non-zero weight on, yet he mixes them so as to make every other player also indifferent.

The mixed strategy equilibria are explained as being the limit of pure strategy equilibria for a disturbed game of incomplete information in which the payoffs of each player are known to themselves but not their opponents. The idea is that the predicted mixed strategy of the original game emerges as an ever-improving approximation of a game that is not observed by the theorist who designed the original, idealized game.

The apparently mixed nature of the strategy is actually just the result of each player playing a pure strategy with threshold values that depend on the ex-ante distribution over the continuum of payoffs that a player can have. As that continuum shrinks to zero, the players' strategies converge to the predicted Nash equilibria of the original, unperturbed, complete information game.

The result is also an important aspect of modern day inquiries in evolutionary game theory where the perturbed values are interpreted as distributions over types of players randomly paired in a population to play games.

Consider the Hawk-Dove game in which mutual Cooperation pays 3 to each player, a player who Cooperates against a Defector gets 2, a Defector against a Cooperator gets 4, and mutual Defection pays 0. The game has two pure strategy equilibria, (Defect, Cooperate) and (Cooperate, Defect). It also has a mixed equilibrium in which each player plays Cooperate with probability 2/3.

Suppose that each player i bears an extra cost ai from playing Cooperate, which is uniformly distributed on [-A, A]. Players only know their own value of this cost, so this is a game of incomplete information which we can solve using Bayesian Nash equilibrium. The probability that ai ≤ a* is (a* + A)/(2A). If player 2 Cooperates when a2 ≤ a*, then player 1's expected utility from Cooperating is -a1 + 3(a* + A)/(2A) + 2(1 - (a* + A)/(2A)); his expected utility from Defecting is 4(a* + A)/(2A). He should therefore himself Cooperate when a1 ≤ 2 - 3(a* + A)/(2A). Seeking a symmetric equilibrium where both players Cooperate if ai ≤ a*, we solve this for a* = 1/(2 + 3/A). Now that we have worked out a*, we can calculate the probability of each player playing Cooperate as

Pr(a_i \le a^*) = \frac{\frac{1}{2+3/A}+A}{2A} = \frac{A}{4A^{2}+6A}+\frac{1}{2}.
As A → 0, this approaches 2/3 - the same probability as in the mixed strategy in the complete information game.

Thus, we can think of the mixed strategy equilibrium as the outcome of pure strategies followed by players who have a small amount of private information about their payoffs.
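
The convergence claim is easy to check numerically. Below is a minimal Python sketch (my own illustration, using only the threshold a* = 1/(2 + 3/A) and the probability formula derived above):

def cooperation_probability(A):
    a_star = 1.0 / (2.0 + 3.0 / A)       # symmetric threshold a*
    return (a_star + A) / (2.0 * A)      # Pr(a_i <= a*)

for A in [10.0, 1.0, 0.1, 0.01, 0.001]:
    print(A, cooperation_probability(A))
# As A shrinks, the printed probabilities rise toward 2/3, the mixed
# equilibrium probability of the unperturbed game.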

Technical Details
Harsanyi's proof involves the strong assumption that the perturbations for each player are independent of the other players. However, further refinements to make the theorem more general have been attempted.[2][3]

The main result of the theorem is that all the mixed strategy equilibria of a given game can be purified using the same sequence of perturbed games. However, in addition to independence of the perturbations, it relies on the set of payoffs for this sequence of games being of full measure. There are games, of a pathological nature, for which this condition fails to hold.

The main problem with these games falls into one of two categories: (1) various mixed strategies of the game are purified by different sequences of perturbed games and (2) some mixed strategies of the game involve weakly dominated strategies. No mixed strategy involving a weakly dominated strategy can be purified using this method because if there is ever any positive probability that the opponent will play a strategy for which the weakly dominated strategy is not a best response, then one will never wish to play the weakly dominated strategy. Hence, the limit fails to hold because it involves a discontinuity.[4]

QMRCoalition-proof Nash equilibrium
The concept of coalition-proof Nash equilibrium applies to certain "noncooperative" environments in which players can freely discuss their strategies but cannot make binding commitments.[1] It emphasizes immunity to deviations that are self-enforcing. While the best-response property in Nash equilibrium is necessary for self-enforceability, it is not generally sufficient when players can jointly deviate in a way that is mutually beneficial.

The Strong Nash equilibrium is criticized as too "strong" in that the environment allows for unlimited private communication.[1] In the coalition-proof Nash equilibrium the private communication is limited.[1]

Formal definition.[1] (i) In a single-player, single-stage game Γ, s* ∈ S is a Perfectly Coalition-Proof Nash equilibrium if and only if s* maximizes g1(s). (ii) Let (n,t) ≠ (1,1). Assume that Perfectly Coalition-Proof Nash equilibrium has been defined for all games with m players and s stages, where (m, s) ≤ (n, t) and (m, s) ≠ (n, t). (a) For any game Γ with n players and t stages, s* ∈ S is perfectly self-enforcing if, for every proper coalition J, s*J is a Perfectly Coalition-Proof Nash equilibrium in the game Γ/s*−J, and if the restriction of s* to any proper subgame forms a Perfectly Coalition-Proof Nash equilibrium in that subgame. (b) For any game Γ with n players and t stages, s* ∈ S is a Perfectly Coalition-Proof Nash equilibrium if it is perfectly self-enforcing, and if there does not exist another perfectly self-enforcing strategy vector s ∈ S such that gi(s) > gi(s*) for all i = 1, ..., n.

Less formal: At first all players are in a room deliberating their strategies. Then one by one, they leave the room fixing their strategy and only those left are allowed to change their strategies, both individually and together.

The coalition-proof Nash equilibrium refines the Nash equilibrium by adopting a stronger notion of self-enforceability that allows multilateral deviations.

Parallel to the idea of correlated equilibrium as an extension to Nash equilibrium when a public signalling device is allowed, coalition-proof equilibrium was defined by Diego Moreno and John Wooders.[2]

John Nash, one of the pioneers of game theory, had a movie, "A Beautiful Mind", made about him

QMRA strong Nash equilibrium is a Nash equilibrium in which no coalition, taking the actions of its complements as given, can cooperatively deviate in a way that benefits all of its members.[1] While the Nash concept of stability defines equilibrium only in terms of unilateral deviations, strong Nash equilibrium allows for deviations by every conceivable coalition.[2] This equilibrium concept is particularly useful in areas such as the study of voting systems, in which there are typically many more players than possible outcomes, and so plain Nash equilibria are far too abundant.

The strong Nash concept is criticized as too "strong" in that the environment allows for unlimited private communication. In fact, strong Nash equilibrium has to be Pareto-efficient. As a result of these requirements, Strong Nash rarely exists in games interesting enough to deserve study. Nevertheless, it is possible for there to be multiple strong Nash equilibria. For instance, in Approval voting, there is always a strong Nash equilibrium for any Condorcet winner that exists, but this is only unique (apart from inconsequential changes) when there is a majority Condorcet winner.

A relatively weaker yet refined Nash stability concept is called coalition-proof Nash equilibrium (CPNE) [2] in which the equilibria are immune to multilateral deviations that are self-enforcing. Every correlated strategy supported by iterated strict dominance and on the Pareto frontier is a CPNE.[3] Further, it is possible for a game to have a Nash equilibrium that is resilient against coalitions less than a specified size k. CPNE is related to the theory of the core.

Confusingly, the concept of a strong Nash equilibrium is unrelated to that of a weak Nash equilibrium. That is, a Nash equilibrium can be both strong and weak, or neither.

QMRIn game theory, a symmetric game is a game where the payoffs for playing a particular strategy depend only on the other strategies employed, not on who is playing them. If one can change the identities of the players without changing the payoff to the strategies, then a game is symmetric. Symmetry can come in different varieties. Ordinally symmetric games are games that are symmetric with respect to the ordinal structure of the payoffs. A game is quantitatively symmetric if and only if it is symmetric with respect to the exact payoffs.

Symmetry in 2x2 games

       E       F
  E    a, a    b, c
  F    c, b    d, d

Only 12 out of the 144 ordinally distinct 2x2 games are symmetric. However, many of the commonly studied 2x2 games are at least ordinally symmetric. The standard representations of chicken, the Prisoner's Dilemma, and the Stag hunt are all symmetric games. Formally, in order for a 2x2 game to be symmetric, its payoff matrix must conform to the schema shown above.

The requirements for a game to be ordinally symmetric are weaker: it need only be the case that the ordinal ranking of the payoffs conforms to the same schema.
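
In code, the symmetry condition just says that the column player's payoff matrix is the transpose of the row player's. A small Python sketch (the Prisoner's Dilemma numbers 3, 0, 5, 1 are only an illustration):

def is_symmetric(p1, p2):
    # p1[r][c] = row player's payoff, p2[r][c] = column player's payoff,
    # with r, c indexing the two strategies 0 and 1
    return all(p1[r][c] == p2[c][r] for r in range(2) for c in range(2))

pd_row = [[3, 0], [5, 1]]   # row player: (C,C)=3, (C,D)=0, (D,C)=5, (D,D)=1
pd_col = [[3, 5], [0, 1]]   # column player
print(is_symmetric(pd_row, pd_col))   # True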

Symmetry and equilibria
Nash (1951) shows that every symmetric game has a symmetric mixed strategy Nash equilibrium. Cheng et al. (2004) show that every two-strategy symmetric game has a (not necessarily symmetric) pure strategy Nash equilibrium.

Uncorrelated asymmetries: payoff neutral asymmetries
Symmetries here refer to symmetries in payoffs. Biologists often refer to asymmetries in payoffs between players in a game as correlated asymmetries. These are in contrast to uncorrelated asymmetries which are purely informational and have no effect on payoffs (e.g. see Hawk-dove game).
QMRChicken

             Swerve       Straight
  Swerve     Tie, Tie     Lose, Win
  Straight   Win, Lose    Crash, Crash
Fig. 1: A payoff matrix of Chicken

             Swerve     Straight
  Swerve     0, 0       -1, +1
  Straight   +1, -1     -10, -10
Fig. 2: Chicken with numerical payoffs
A formal version of the game of Chicken has been the subject of serious research in game theory.[6] Two versions of the payoff matrix for this game are presented here (Figures 1 and 2). In Figure 1, the outcomes are represented in words, where each player would prefer to win over tying, prefer to tie over losing, and prefer to lose over crashing. Figure 2 presents arbitrarily set numerical payoffs which theoretically conform to this situation. Here, the benefit of winning is 1, the cost of losing is -1, and the cost of crashing is -10.

Both Chicken and Hawk-Dove are anti-coordination games, in which it is mutually beneficial for the players to play different strategies. In this way, it can be thought of as the opposite of a coordination game, where playing the same strategy Pareto dominates playing different strategies. The underlying concept is that players use a shared resource. In coordination games, sharing the resource creates a benefit for all: the resource is non-rivalrous, and the shared usage creates positive externalities. In anti-coordination games the resource is rivalrous but non-excludable and sharing comes at a cost (or negative externality).

Because the loss of swerving is so trivial compared to the crash that occurs if nobody swerves, the reasonable strategy would seem to be to swerve before a crash is likely. Yet, knowing this, if one believes one's opponent to be reasonable, one may well decide not to swerve at all, in the belief that he will be reasonable and decide to swerve, leaving the other player the winner. This unstable situation can be formalized by saying there is more than one Nash equilibrium, which is a pair of strategies for which neither player gains by changing his own strategy while the other stays the same. (In this case, the pure strategy equilibria are the two situations wherein one player swerves while the other does not.)
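
The two pure strategy equilibria of the Fig. 2 game can be confirmed by brute force. A short Python sketch (the encoding 0 = Swerve, 1 = Straight is my own; only the payoffs come from the matrix):

payoff = {  # (row move, column move) -> (row payoff, column payoff)
    (0, 0): (0, 0),   (0, 1): (-1, 1),
    (1, 0): (1, -1),  (1, 1): (-10, -10),
}

def is_nash(r, c):
    best_r = max(payoff[(a, c)][0] for a in (0, 1))
    best_c = max(payoff[(r, b)][1] for b in (0, 1))
    return payoff[(r, c)][0] == best_r and payoff[(r, c)][1] == best_c

print([cell for cell in payoff if is_nash(*cell)])   # [(0, 1), (1, 0)]
# Exactly the two outcomes where one player swerves and the other does not.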

Hawk-Dove

          Hawk                 Dove
  Hawk    (V−C)/2, (V−C)/2     V, 0
  Dove    0, V                 V/2, V/2
Fig. 3: Hawk-Dove game

          Hawk     Dove
  Hawk    X, X     W, L
  Dove    L, W     T, T
Fig. 4: General Hawk-Dove game
In the biological literature, this game is referred to as Hawk-Dove. The earliest presentation of a form of the Hawk-Dove game was by John Maynard Smith and George Price in their paper, "The logic of animal conflict".[7] The traditional [5][8] payoff matrix for the Hawk-Dove game is given in Figure 3, where V is the value of the contested resource, and C is the cost of an escalated fight. It is (almost always) assumed that the value of the resource is less than the cost of a fight, i.e., C > V > 0. If C ≤ V, the resulting game is not a game of Chicken but is instead a Prisoner's Dilemma.

Hawk-Dove transforming into Prisoner's Dilemma. As C becomes smaller than V, the mixed strategy equilibrium moves to the pure strategy equilibrium of both players playing hawk (see Replicator dynamics).
The exact value of the Dove vs. Dove payoff varies between model formulations. Sometimes the players are assumed to split the payoff equally (V/2 each), other times the payoff is assumed to be zero (since this is the expected payoff of a war of attrition game, which is the presumed model for a contest decided by display duration).

While the Hawk-Dove game is typically taught and discussed with the payoffs in terms of V and C, the solutions hold true for any matrix with the payoffs in Figure 4, where W > T > L > X.[8]
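
With C > V the game also has a mixed equilibrium in which each player chooses Hawk with probability V/C. The sketch below (V = 2 and C = 8 are arbitrary illustrative values) verifies the defining indifference condition against the Fig. 3 payoffs:

V, C = 2.0, 8.0            # resource value and fight cost, with C > V
p = V / C                  # equilibrium probability of playing Hawk

hawk_payoff = p * (V - C) / 2 + (1 - p) * V    # Hawk against the equilibrium mix
dove_payoff = p * 0 + (1 - p) * V / 2          # Dove against the equilibrium mix
print(hawk_payoff, dove_payoff)                # 0.75 0.75: both strategies earn the same,
                                               # so each player is willing to mix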

Hawk-Dove variants
Biologists have explored modified versions of classic Hawk-Dove game to investigate a number of biologically relevant factors. These include adding variation in resource holding potential, and differences in the value of winning to the different players,[9] allowing the players to threaten each other before choosing moves in the game,[10] and extending the interaction to two plays of the game.[11]

Pre-commitment
One tactic in the game is for one party to signal their intentions convincingly before the game begins. For example, if one party were to ostentatiously disable their steering wheel just before the match, the other party would be compelled to swerve.[12] This shows that, in some circumstances, reducing one's own options can be a good strategy. One real-world example is a protester who handcuffs himself to an object, so that no threat can be made which would compel him to move (since he cannot move). Another example, taken from fiction, is found in Stanley Kubrick's Dr. Strangelove. In that film, the Russians sought to deter American attack by building a "doomsday machine," a device that would trigger world annihilation if Russia was hit by nuclear weapons or if any attempt were made to disarm it. However, the Russians had planned to signal the deployment of the machine a few days after having set it up, which, because of an unfortunate course of events, turned out to be too late.

Players may also make non-binding threats to not swerve. This has been modeled explicitly in the Hawk-Dove game. Such threats work, but must be wastefully costly if the threat is one of two possible signals ("I will not swerve"/"I will swerve"), or they will be costless if there are three or more signals (in which case the signals will function as a game of "Rock, Paper, Scissors").[10]

QMRIterated snowdrift
Researchers from the University of Lausanne and the University of Edinburgh have suggested that the "Iterated Snowdrift Game" may more closely reflect real-world social situations. Although this model is actually a chicken game, it will be described here. In this model, the risk of being exploited through defection is lower, and individuals always gain from taking the cooperative choice. The snowdrift game imagines two drivers who are stuck on opposite sides of a snowdrift, each of whom is given the option of shoveling snow to clear a path, or remaining in their car. A player's highest payoff comes from leaving the opponent to clear all the snow by themselves, but the opponent is still nominally rewarded for their work.

This may better reflect real world scenarios, the researchers giving the example of two scientists collaborating on a report, both of whom would benefit if the other worked harder. "But when your collaborator doesn’t do any work, it’s probably better for you to do all the work yourself. You’ll still end up with a completed project."[33]

Example Snowdrift Payouts (A, B)
                   B cooperates    B defects
  A cooperates     200, 200        100, 300
  A defects        300, 100        0, 0

Example PD Payouts (A, B)
                   B cooperates    B defects
  A cooperates     200, 200        -100, 300
  A defects        300, -100       0, 0
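
The key difference between the two tables is what A should do against a defecting partner. A short Python sketch (0 = cooperate, 1 = defect, an arbitrary encoding of the payoffs above):

snowdrift = {(0, 0): (200, 200), (0, 1): (100, 300),
             (1, 0): (300, 100), (1, 1): (0, 0)}
pd        = {(0, 0): (200, 200), (0, 1): (-100, 300),
             (1, 0): (300, -100), (1, 1): (0, 0)}

def best_response_of_A(game, b_move):
    return max((0, 1), key=lambda a: game[(a, b_move)][0])

for name, game in (("snowdrift", snowdrift), ("prisoner's dilemma", pd)):
    print(name, "-> A's best reply to a defector:", best_response_of_A(game, 1))
# In the snowdrift game A should still cooperate against a defector (100 beats 0);
# in the prisoner's dilemma, defecting remains the best reply (0 beats -100).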

QMRThe Four Stages of Trust
by Riki Robbins, Ph.D.

Trust evolves. We start off as babies with perfect trust. Inevitably, trust is damaged by our parents or other family members. Depending on the severity, we may experience devastated trust, in which the trust is completely broken. In order to heal, we must learn when and how trust can be restored. As part of this final step, if we cannot fully trust someone, then we establish guarded, conditional, or selective trust.

Perfect Trust
The first people besides ourselves that we learn to trust — or mistrust — are our parents. If they behave with integrity, tell us the truth, and keep their promises, then we are inclined to believe that other people will do the same thing. If our parents tell us to trust them, and then break their word, we may never learn to trust at all.

When Cathy, a college professor, was betrayed, she experienced total mistrust at first. She asked me, "Can I trust anyone: myself, other people, or even God?" I asked her if she remembered feeling this way before. She thought for a moment and then replied, "Yes. When I was a little girl. My father was a minister devoted to spreading the word of God. Yet he beat me and my brother regularly. It seemed so crazy to me. How could someone who was supposed to be so good act so bad? If I couldn't trust him to back up his words with actions, then I couldn't trust anyone else." Since I fully empathized with how Cathy was feeling, it was difficult to disagree with her. But I did tell her that unless she changed her attitude she wouldn't have healthy love relationships in the future.

None of us become adults and retain the perfect trust we were born with. But that doesn't mean we have to go to the opposite extreme. As my good friend author and public speaker Cheewa James puts it, "I trust everybody at the beginning. I assume everyone is loving until proven otherwise". For best results, start off a relationship with the assumption that the other person is trustworthy. Be careful to protect yourself, but give him (or her) the benefit of the doubt.

Damaged Trust
Inevitably, the person you love will violate your trust. The most common warning signs include:

Withholding vital information. You say, "Where were you last night until 2:00 A.M.?" "Nowhere special."

Lying. He says, "I was working late," but when you called his office, there was no answer.

Giving you mixed messages. He denies your accusations but doesn't look you in the eye.

Refusing to negotiate. When you ask, "Will you promise to stay away from her?" he says, "Leave me alone," and walks away.

Deep in your heart you know that trust has been damaged.

When you find out about a betrayal immediately after it happens, trust is broken. But it is not necessarily devastating. Especially if it is a mini-betrayal, you and your partner can talk about the incident, agree that it won't occur again, and reestablish a bond of openness and loyalty.

Devastated Trust
When your partner violates your limits and behaves in a way you find morally unacceptable, your trust is completely broken. Typically this happens after a betrayal when you've been cheated on, lied to, and treated with profound disrespect.

Devastated trust is a crisis. The first time it happens you may totally regress. You feel as if you're five years old as you re-experience your original fundamental loss. You ask yourself, just as Cathy did, "Whom can I trust?" You may answer your own question, "Not my mother or my father, not even my partner. Who's left?" Before you can think about trusting yourself and other people, you have to deal with the situation at hand. Can trust possibly be restored? If not, you will have to end the relationship despite any remaining good qualities.

What happens if you suddenly find out that you've been betrayed long ago? This happened to Edith, a newspaper editor. After her husband, Joe, returned from a weekend personal growth seminar, he decided to "come clean" about his previous sexual infidelities. Late one night, he told Edith that when he had visited an old out-of-town girlfriend five years ago, the two of them had sex. Furthermore, they had both discussed the possibility of ending their marriages so they could have a serious relationship together. "I could never trust Joe again after that," Edith told me. "If he had told me at the time we might have been able to salvage something. But to find out five years later? All this time he'd been withholding vital information. How could I possibly know what else he is hiding now?"

Francesca, a computer technician, was offered a choice. Her husband, George, told her, "During the early years of our marriage I committed a few indiscretions. I'd like to tell you so I can get them off my chest. Is this all right with you?" Francesca thought for a while before she responded, "You can tell me if you like. But if you do I'll never believe another word you say again. The time to tell me was when it happened, not now." Of course, simply by bringing up the subject, he shattered her trust completely.

If you suspect that your partner betrayed you, you should confront him as soon as you can. You may rationalize, "I don't want to hurt him, get into an argument, or rock the boat." Short-term pain is long-term gain. Every moment you wait, trust is eroded. Conversely, if you betray your partner, either reveal it at the time or else take a vow of eternal silence. Sharing a betrayal farther down the road devastates trust.

If trust is repeatedly broken can it be restored? No. Harriet, a registered nurse, had a tumultuous courtship. Her fiancé, Ira, left her to go back to a former girlfriend. When they broke up, he returned to her, promised her an engagement ring, and asked her to marry him. Two weeks later, he spent the weekend with another former girlfriend. Upon his return, he announced that he wanted to postpone their engagement because he wanted to continue dating. Harriet waited patiently until he gave up his second girlfriend. Six months later, she married him. It was a mistake. Harriet said to me, "I actually believed that Ira and I could 'start over'. But it wasn't true. I had lost all respect for him. My trust had been violated so often that I found myself waiting for it to happen again. And Ira continued his habit of having other sexual relationships behind my back. For our relationship to survive it was up to him to take the lead in restoring trust. And he didn't."

Restored Trust
Can you restore sexual or romantic trust once it is damaged or destroyed? It's possible, but difficult. You don't get past a betrayal overnight; it takes months or even years.

The good news is that the aftermath of a betrayal is an opportunity to strengthen your relationship. If you and your partner openly talk about what happened, you will open the gateway to deeper intimacy. While you cannot be positive that you won't be betrayed again you can certainly minimize the chances.

Discuss your partner's motives for betraying you and your own involvement in the cause. Honestly share how you feel, and what you need at the present moment. Express your concerns about the future, let each other know what you expect from now on, and state your limits about what you will and won't put up with. If you can't have this kind of conversation by yourselves, then get professional help right away. Don't wait; mistrust can become a habit. A qualified therapist, psychologist, or marriage counselor can guide the two of you as you explore why the betrayal happened and how to prevent another one. Gradually you'll start trusting each other in small matters — and then in bigger ones.

One thing's for sure: You can't turn back the clock. You and your partner don't feel the same way toward each other anymore. Trust has been broken and it's difficult to fix. As you put your relationship back together, both of you see each other differently. You think, "Maybe I can trust this person again but from now on I need to be careful." Your trust is not as complete as it once was. It may instead be guarded, conditional, or selective.

QMRContract design
Milgrom and Roberts (1992) identify four principles of contract design. When perfect information is not available, Holmström (1979) developed the Informativeness Principle to solve this problem. It essentially states that any measure of performance that (on the margin) reveals information about the effort level chosen by the agent should be included in the compensation contract. This includes, for example, Relative Performance Evaluation—measurement relative to other, similar agents, so as to filter out some common background noise factors, such as fluctuations in demand. By removing some exogenous sources of randomness in the agent’s income, a greater proportion of the fluctuation in the agent’s income falls under his control, increasing his ability to bear risk. If taken advantage of, by greater use of piece rates, this should improve incentives. (In terms of the simple linear model below, this means that increasing x produces an increase in b.)

However, setting incentives as intense as possible is not necessarily optimal from the point of view of the employer. The Incentive-Intensity Principle states that the optimal intensity of incentives depends on four factors: the incremental profits created by additional effort, the precision with which the desired activities are assessed, the agent’s risk tolerance, and the agent’s responsiveness to incentives. According to Prendergast (1999, 8), "the primary constraint on [performance-related pay] is that [its] provision imposes additional risk on workers ..." A typical result of the early principal–agent literature was that piece rates tend to 100% (of the compensation package) as the worker becomes more able to handle risk, as this ensures that workers fully internalize the consequences of their costly actions. In incentive terms, where we conceive of workers as self-interested rational individuals who provide costly effort (in the most general sense of the worker’s input to the firm’s production function), the more compensation varies with effort, the better the incentives for the worker to produce.

The third principle—the Monitoring Intensity Principle—is complementary to the second, in that situations in which the optimal intensity of incentives is high corresponds highly to situations in which the optimal level of monitoring is also high. Thus employers effectively choose from a “menu” of monitoring/incentive intensities. This is because monitoring is a costly means of reducing the variance of employee performance, which makes more difference to profits in the kinds of situations where it is also optimal to make incentives intense.

The fourth principle is the Equal Compensation Principle, which essentially states that activities equally valued by the employer should be equally valuable (in terms of compensation, including non-financial aspects such as pleasantness of the workplace) to the employee. This relates to the problem that employees may be engaged in several activities, and if some of these are not monitored or are monitored less heavily, these will be neglected, as activities with higher marginal returns to the employee are favoured. This can be thought of as a kind of "disintermediation"—targeting certain measurable variables may cause others to suffer. For example, teachers being rewarded by test scores of their students are likely to tend more towards teaching 'for the test', and de-emphasise less relevant but perhaps equally or more important aspects of education; while AT&T’s practice at one time of paying programmers by the number of lines of code written resulted in programs that were longer than necessary—i.e., program efficiency suffering (Prendergast 1999, 21). Following Holmström and Milgrom (1990) and Baker (1992), this has become known as "multi-tasking" (where a subset of relevant tasks is rewarded, non-rewarded tasks suffer relative neglect). Because of this, the more difficult it is to completely specify and measure the variables on which reward is to be conditioned, the less likely that performance-related pay will be used: “in essence, complex jobs will typically not be evaluated through explicit contracts.” (Prendergast 1999, 9).

Where explicit measures are used, they are more likely to be some kind of aggregate measure, for example, baseball and American football players are rarely rewarded on the many specific measures available (e.g., number of home runs), but frequently receive bonuses for aggregate performance measures such as Most Valuable Player. The alternative to objective measures is subjective performance evaluation, typically by supervisors. However, there is here a similar effect to "multi-tasking", as workers shift effort from that subset of tasks which they consider useful and constructive, to that subset which they think gives the greatest appearance of being useful and constructive, and more generally to try to curry personal favour with supervisors. (One can interpret this as a destruction of organizational social capital—workers identifying with, and actively working for the benefit of, the firm – in favour of the creation of personal social capital—the individual-level social relations which enable workers to get ahead (“networking”).)

Linear model
The four principles can be summarized in terms of the simplest (linear) model of incentive compensation:

w = a + b(e + x + gy)
where w stands for the wage, e for (unobserved) effort, x for unobserved exogenous effects on outcomes, and y for observed exogenous effects; while g and a represent the weight given to y, and the base salary, respectively. The interpretation of b is as the intensity of incentives provided to the employee.

wage = (base salary) + (incentive intensity) · ((unobserved) effort + (unobserved) exogenous effects + (weight g) · (observed exogenous effects y))
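
As a rough numerical illustration of this linear contract (all figures below are invented), raising effort e by one unit raises the wage by b, so b really does measure how strongly pay responds to what the worker does:

def wage(a, b, e, x, g, y):
    return a + b * (e + x + g * y)

base, intensity, weight = 1000.0, 0.5, 0.2
print(wage(base, intensity, e=40.0, x=-5.0, g=weight, y=10.0))   # 1018.5
print(wage(base, intensity, e=60.0, x=-5.0, g=weight, y=10.0))   # 1028.5, i.e. 20 extra units
                                                                  # of effort earn 20*b = 10 more
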
The above discussion on explicit measures assumed that contracts would create the linear incentive structures summarised in the model above. But while the combination of normal errors and the absence of income effects yields linear contracts, many observed contracts are nonlinear. To some extent this is due to income effects as workers rise up a tournament/hierarchy: "Quite simply, it may take more money to induce effort from the rich than from the less well off." (Prendergast 1999, 50). Similarly, the threat of being fired creates a nonlinearity in wages earned versus performance. Moreover, many empirical studies illustrate inefficient behaviour arising from nonlinear objective performance measures, or measures over the course of a long period (e.g., a year), which create nonlinearities in time due to discounting behaviour. This inefficient behaviour arises because incentive structures are varying: for example, when a worker has already exceeded a quota or has no hope of reaching it, versus being close to reaching it—e.g., Healy (1985), Oyer (1997), Leventis (1997). Leventis shows that New York surgeons, penalised for exceeding a certain mortality rate, take less risky cases as they approach the threshold. Courty and Marshke (1997) provide evidence on incentive contracts offered to agencies, which receive bonuses on reaching a quota of graduated trainees within a year. This causes them to ‘rush-graduate’ trainees in order to make the quota.

Options framework
In certain cases agency problems may be analysed by applying the techniques developed for financial options, as applied via a real options framework.[10][11] Stockholders and bondholders have different objectives—for instance, stockholders have an incentive to take riskier projects than bondholders do, and to pay more out in dividends than bondholders would like. At the same time, since equity may be seen as a call option on the value of the firm, an increase in the variance in the firm value, other things remaining equal, will lead to an increase in the value of equity, and stockholders may therefore take risky projects with negative net present values, which while making them better off, may make the bondholders worse off. See Option pricing approaches under Business valuation for further discussion.

Best response correspondence

Figure 1. Reaction correspondence for player Y in the Stag Hunt game.
Reaction correspondences, also known as best response correspondences, are used in the proof of the existence of mixed strategy Nash equilibria (Fudenberg & Tirole 1991, Section 1.3.B; Osborne & Rubinstein 1994, Section 2.2). Reaction correspondences are not "reaction functions" since functions must only have one value per argument, and many reaction correspondences will be set-valued, appearing as a vertical line in the graph, for some opponent strategy choice. One constructs a correspondence b(\cdot), for each player, from the set of opponent strategy profiles into the set of the player's strategies. So, for any given set of opponent's strategies \sigma_{-i}, b_i(\sigma_{-i}) represents player i's best responses to \sigma_{-i}.

Figure 2. Reaction correspondence for player X in the Stag Hunt game.
Response correspondences for all 2x2 normal form games can be drawn with a line for each player in a unit square strategy space. Figures 1 to 3 graph the best response correspondences for the stag hunt game. The dotted line in Figure 1 shows the optimal probability that player Y plays 'Stag' (on the y-axis), as a function of the probability that player X plays Stag (shown on the x-axis). In Figure 2 the dotted line shows the optimal probability that player X plays 'Stag' (shown on the x-axis), as a function of the probability that player Y plays Stag (shown on the y-axis). Note that Figure 2 plots the independent and response variables on the opposite axes to those normally used, so that it may be superimposed onto the previous graph, to show the Nash equilibria at the points where the two players' best responses agree in Figure 3.

There are three distinctive reaction correspondence shapes, one for each of the three types of symmetric 2x2 games: coordination games, discoordination games and games with dominated strategies (the trivial fourth case in which payoffs are always equal for both moves is not really a game theoretical problem). Any payoff symmetric 2x2 game will take one of these three forms.

Coordination games
Games in which players score highest when both players choose the same strategy, such as the stag hunt and battle of the sexes are called coordination games. These games have reaction correspondences of the same shape as Figure 3, where there is one Nash equilibrium in the bottom left corner, another in the top right, and a mixing Nash somewhere along the diagonal between the other two.
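
The shape is easy to reproduce. The Python sketch below uses illustrative stag hunt payoffs (Stag against Stag pays 3, Stag against Hare pays 0, Hare always pays 2); it is not the exact game behind the figures, but it produces the same kind of correspondence:

from fractions import Fraction

def best_response(q):
    # best replies to an opponent who plays Stag with probability q
    e_stag = 3 * q                     # expected payoff of Stag
    e_hare = 2 * q + 2 * (1 - q)       # expected payoff of Hare (always 2 here)
    if e_stag > e_hare:
        return ["Stag"]
    if e_stag < e_hare:
        return ["Hare"]
    return ["Hare", "Stag"]            # indifferent: the vertical segment in the figure

for q in (Fraction(0), Fraction(1, 2), Fraction(2, 3), Fraction(1)):
    print(q, best_response(q))
# The best reply jumps from Hare to Stag at q = 2/3, giving the two corner
# equilibria (Hare, Hare) and (Stag, Stag) plus the mixed equilibrium where
# both players choose Stag with probability 2/3.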

Anti-coordination games

Figure 3. Reaction correspondence for both players in the Stag Hunt game. Nash equilibria shown with points, where the two players' correspondences agree, i.e. cross.
Games such as the game of chicken and hawk-dove game in which players score highest when they choose opposite strategies, i.e., discoordinate, are called anti-coordination games. They have reaction correspondences (Figure 4) that cross in the opposite direction to coordination games, with three Nash equilibria, one in each of the top left and bottom right corners, where one player chooses one strategy, the other player chooses the opposite strategy. The third Nash equilibrium is a mixed strategy which lies along the diagonal from the bottom left to top right corners. If the players do not know which one of them is which, then the mixed Nash is an evolutionarily stable strategy (ESS), as play is confined to the bottom left to top right diagonal line. Otherwise an uncorrelated asymmetry is said to exist, and the corner Nash equilibria are ESSes.

Games with dominated strategies

Figure 5. Reaction correspondence for a game with a dominated strategy.
Games with dominated strategies have reaction correspondences which only cross at one point, which will be in either the bottom left or top right corner in payoff symmetric 2x2 games. For instance, in the single-play prisoner's dilemma, the "Cooperate" move is not optimal for any probability of opponent Cooperation. Figure 5 shows the reaction correspondence for such a game, where the dimensions are "probability of playing Cooperate"; the Nash equilibrium is in the lower left corner, where neither player plays Cooperate. If the dimensions were instead defined as "probability of playing Defect", then both players' best response curves would be 1 for all opponent strategy probabilities, and the reaction correspondences would cross (and form a Nash equilibrium) at the top right corner.

Other (payoff asymmetric) games
A wider range of reaction correspondences shapes is possible in 2x2 games with payoff asymmetries. For each player there are five possible best response shapes, shown in Figure 6. From left to right these are: dominated strategy (always play 2), dominated strategy (always play 1), rising (play strategy 2 if probability that the other player plays 2 is above threshold), falling (play strategy 1 if probability that the other player plays 2 is above threshold), and indifferent (both strategies play equally well under all conditions).

While there are only four possible types of payoff symmetric 2x2 games (of which one is trivial), the five different best response curves per player allow for a larger number of payoff asymmetric game types. Many of these are not truly different from each other. The dimensions may be redefined (exchange names of strategies 1 and 2) to produce symmetrical games which are logically identical.

Matching pennies
One well-known game with payoff asymmetries is the matching pennies game. In this game one player, the row player — graphed on the y dimension — wins if the players coordinate (both choose heads or both choose tails) while the other player, the column player — shown in the x-axis — wins if the players discoordinate. Player Y's reaction correspondence is that of a coordination game, while that of player X is a discoordination game. The only Nash equilibrium is the combination of mixed strategies where both players independently choose heads and tails with probability 0.5 each.

Best response dynamics
In evolutionary game theory, best response dynamics represents a class of strategy updating rules, where players' strategies in the next round are determined by their best responses to some subset of the population. Some examples include:

In a large population model, players choose their next action probabilistically based on which strategies are best responses to the population as a whole.
In a spatial model, players choose (in the next round) the action that is the best response to all of their neighbors (Ellison 1993).
Importantly, in these models players choose only a best response for the next round, i.e. the action that gives them the highest payoff in that round. Players do not consider the effect that choosing a strategy on the next round would have on future play in the game. This constraint results in the dynamical rule often being called myopic best response.

In the theory of potential games, best response dynamics refers to a way of finding a Nash equilibrium by computing the best response for every player:

Theorem: In any finite potential game, best response dynamics always converge to a Nash equilibrium. (Nisan et al. 2007, Section 19.3.2)
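
A minimal sketch of the dynamic (a two-strategy coordination game is a finite potential game, and the payoffs and strategy labels below are invented for illustration): players take turns revising, and the process stops as soon as neither wants to change, i.e. at a Nash equilibrium.

import itertools

payoffs = {  # (move of player 0, move of player 1) -> (payoff 0, payoff 1)
    ("A", "A"): (2, 2), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1),
}

def best_reply(player, profile):
    other = profile[1 - player]
    def pay(move):
        cell = (move, other) if player == 0 else (other, move)
        return payoffs[cell][player]
    return max(("A", "B"), key=pay)

profile = ["B", "A"]                      # start away from equilibrium
for player in itertools.cycle((0, 1)):
    reply = best_reply(player, profile)
    if reply == profile[player] and best_reply(1 - player, profile) == profile[1 - player]:
        break                             # nobody wants to deviate: Nash equilibrium
    profile[player] = reply
print(profile)                            # ['A', 'A'] from this starting point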

In game theory, grim trigger (also called the grim strategy or just grim) is a trigger strategy for a repeated game, such as an iterated prisoner's dilemma. Initially, a player using grim trigger will cooperate, but as soon as the opponent defects (thus satisfying the trigger condition), the player using grim trigger will defect for the remainder of the iterated game. Since a single defect by the opponent triggers defection forever, grim trigger is the most strictly unforgiving of strategies in an iterated game.

In iterated prisoner's dilemma strategy competitions, grim trigger performs poorly even without noise, and adding signal errors makes it even worse. Its ability to threaten permanent defection gives it a theoretically effective way to sustain trust, but because of its unforgiving nature and the inability to communicate this threat in advance, it performs poorly.[1]
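
A toy run of grim trigger makes the unforgiving behaviour concrete. The standard PD payoffs (3 for mutual cooperation, 1 for mutual defection, 5 and 0 for unilateral defection) and the opponent's single accidental defection are my own illustrative choices:

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def grim(opponent_history):
    return "D" if "D" in opponent_history else "C"   # defect forever after one defection

def occasional_defector(round_number):
    return "D" if round_number == 3 else "C"         # one accidental slip

grim_hist, opp_hist, scores = [], [], [0, 0]
for t in range(10):
    a, b = grim(opp_hist), occasional_defector(t)
    grim_hist.append(a)
    opp_hist.append(b)
    pa, pb = PAYOFF[(a, b)]
    scores[0] += pa
    scores[1] += pb
print("".join(grim_hist))   # CCCCDDDDDD: one slip and grim never cooperates again
print(scores)               # [39, 14]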

In Robert Axelrod's book The Evolution of Cooperation, grim trigger is called "Friedman", for a 1971 paper by James Friedman which uses the concept.[2]

QMRThe evolution of cooperation can refer to:

the study of how cooperation can emerge and persist (also known as cooperation theory) as elucidated by application of game theory,
a 1981 paper by political scientist Robert Axelrod and evolutionary biologist W. D. Hamilton (Axelrod & Hamilton 1981) in the scientific literature, or
a 1984 book by Axelrod (Axelrod 1984)[1] that expanded on the paper and popularized the study.
This article is an introduction to how game theory and computer modeling are illuminating certain aspects of moral and political philosophy, particularly the role of individuals in groups, the "biology of selfishness and altruism",[2] and how cooperation can be evolutionarily advantageous.

Axelrod's tournaments
Axelrod initially solicited strategies from other game theorists to compete in the first tournament. Each strategy was paired with each other strategy for 200 iterations of a Prisoner's Dilemma game, and scored on the total points accumulated through the tournament. The winner was a very simple strategy submitted by Anatol Rapoport called "TIT FOR TAT" (TFT) that cooperates on the first move, and subsequently echoes (reciprocates) what the other player did on the previous move. The results of the first tournament were analyzed and published, and a second tournament was held to see if anyone could find a better strategy. TIT FOR TAT won again. Axelrod analyzed the results, and made some interesting discoveries about the nature of cooperation, which he describes in his book.[30]
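
The tournament mechanics are easy to reproduce in miniature. The sketch below runs a 200-round round robin among just three illustrative strategies (Axelrod's field was much larger and more varied, so the ranking here should not be over-interpreted):

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_hist, opp_hist):
    return opp_hist[-1] if opp_hist else "C"

def always_defect(my_hist, opp_hist):
    return "D"

def always_cooperate(my_hist, opp_hist):
    return "C"

def match(s1, s2, rounds=200):
    h1, h2, score = [], [], 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        score += PAYOFF[(m1, m2)][0]
        h1.append(m1)
        h2.append(m2)
    return score                      # s1's total points in this match

strategies = {"TFT": tit_for_tat, "ALL-D": always_defect, "ALL-C": always_cooperate}
totals = {name: sum(match(s, other) for other in strategies.values())
          for name, s in strategies.items()}
print(totals)   # {'TFT': 1399, 'ALL-D': 1404, 'ALL-C': 1200}
# In this tiny field ALL-D edges out TFT by feeding on the unconditional cooperator;
# with a larger and more varied population, as in Axelrod's tournaments, the nice
# but provocable strategies come out ahead (see the discussion of composition below).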

In both actual tournaments and various replays the best performing strategies were nice:[31] that is, they were never the first to defect. Many of the competitors went to great lengths to gain an advantage over the "nice" (and usually simpler) strategies, but to no avail: tricky strategies fighting for a few points generally could not do as well as nice strategies working together. TFT (and other "nice" strategies generally) "won, not by doing better than the other player, but by eliciting cooperation [and] by promoting the mutual interest rather than by exploiting the other's weakness."[32]

Being "nice" can be beneficial, but it can also lead to being suckered. To obtain the benefit – or avoid exploitation – it is necessary to be provocable to both retaliation and forgiveness. When the other player defects, a nice strategy must immediately be provoked into retaliatory defection.[33] The same goes for forgiveness: return to cooperation as soon as the other player does. Overdoing the punishment risks escalation, and can lead to an "unending echo of alternating defections" that depresses the scores of both players.[34]

Most of the games that game theory had heretofore investigated are "zero-sum" – that is, the total rewards are fixed, and a player does well only at the expense of other players. But real life is not zero-sum. Our best prospects are usually in cooperative efforts. In fact, TFT cannot score higher than its partner; at best it can only do "as good as". Yet it won the tournaments by consistently scoring a strong second-place with a variety of partners.[35] Axelrod summarizes this as don't be envious;[36] in other words, don't strive for a payoff greater than the other player's.[37]

In any IPD game there is a certain maximum score each player can get by always cooperating. But some strategies try to find ways of getting a little more with an occasional defection (exploitation). This can work against some strategies that are less provocable or more forgiving than TIT FOR TAT, but generally they do poorly. "A common problem with these rules is that they used complex methods of making inferences about the other player [strategy] – and these inferences were wrong."[38] Against TFT one can do no better than to simply cooperate.[39] Axelrod calls this clarity. Or: don't be too clever.[40]

The success of any strategy depends on the nature of the particular strategies it encounters, which depends on the composition of the overall population. To better model the effects of reproductive success Axelrod also did an "ecological" tournament, where the prevalence of each type of strategy in each round was determined by that strategy's success in the previous round. The competition in each round becomes stronger as weaker performers are reduced and eliminated. The results were amazing: a handful of strategies – all "nice" – came to dominate the field.[41] In a sea of non-nice strategies the "nice" strategies – provided they were also provokable – did well enough with each other to offset the occasional exploitation. As cooperation became general the non-provocable strategies were exploited and eventually eliminated, whereupon the exploitive (non-cooperating) strategies were out-performed by the cooperative strategies.

In summary, success in an evolutionary "game" correlated with the following four characteristics:

Be nice: cooperate, never be the first to defect.
Be provocable: return defection for defection, cooperation for cooperation.
Don't be envious: focus on maximizing your own 'score', as opposed to ensuring your score is higher than your 'partner's'.
Don't be too clever: or, don't try to be tricky.

QMRThe Complexity of Cooperation, by Robert Axelrod (ISBN 0691015678), is the sequel to The Evolution of Cooperation. It is a compendium of seven articles that previously appeared in journals on a variety of subjects. The book extends Axelrod's method of applying the results of game theory, in particular that derived from analysis of the iterated Prisoner's Dilemma (IPD) problem, to real world situations.

Prisoner's Dilemma findings
Axelrod explains that the Tit for Tat (TFT or T4T) strategy emerged as the most robust option in early IPD tournaments on computer. This strategy combines a willingness to cooperate with a determination to punish non-cooperation. In these articles, however, he shows that, under more complex circumstances, such as the possibility of error, strategies that are a little more cooperative or a little less punitive do even better than TFT. Generous TFT, or GTFT, cooperates a bit more often than TFT, while Contrite TFT, or CTFT, defects less frequently.
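
The effect of a little forgiveness is easy to see in a noisy self-play simulation. Everything below (the 5% implementation error, the 10% generosity level, and the payoffs) is an illustrative assumption of mine rather than Axelrod's setup:

import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tft(opp_hist):
    return opp_hist[-1] if opp_hist else "C"

def generous_tft(opp_hist, generosity=0.1):
    if opp_hist and opp_hist[-1] == "D" and random.random() >= generosity:
        return "D"
    return "C"                      # forgive a defection with probability `generosity`

def noisy(move, error=0.05):
    # an intended cooperation occasionally comes out as a defection
    return "D" if move == "C" and random.random() < error else move

def average_self_play(strategy, rounds=2000):
    h1, h2, total = [], [], 0
    for _ in range(rounds):
        m1, m2 = noisy(strategy(h2)), noisy(strategy(h1))
        h1.append(m1)
        h2.append(m2)
        total += PAYOFF[(m1, m2)][0]
    return total / rounds

random.seed(0)
print("noisy TFT vs TFT:  ", average_self_play(tft))
print("noisy GTFT vs GTFT:", average_self_play(generous_tft))
# Plain TFT gets stuck in long echoes of alternating defection after an error,
# while the generous variant typically recovers and averages a higher payoff.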

Applications
Axelrod applied various models related to IPD to a variety of situations, drawing conclusions from these simulations about the ways in which groups form, adhere, oppose or join other groups, and other topics in the fields of genetic evolution, business, political science, military alliances, wars, and more. He has added introductions to these articles explaining what real-world issues drove his research.

Critical response
Philosopher and political economist Francis Fukuyama, writing in Foreign Affairs, praises the book for showing that realist models, which assume that in situations lacking a single sovereign actor anarchy will necessarily result, are too simplistic. Fukuyama expresses concern, however, that the game theory approaches aren't sufficiently complex to model real international relations, because they assume a world with large numbers of simple actors. Fukuyama holds that, instead, the real world consists of a small number of highly complex actors, thus potentially limiting the applicability of Axelrod's analysis.[1]

QMRRobert Harris Frank (born January 2, 1945)[1][2] is the Henrietta Johnson Louis Professor of Management and a Professor of Economics at the Samuel Curtis Johnson Graduate School of Management at Cornell University. He contributes to the "Economic View" column, which appears every fifth Sunday in The New York Times. Frank's 2011 book is on wealth inequality in the United States.[3]

Prisoner's dilemma and cooperation
Frank, Gilovich, and Regan (1993) conducted an experimental study of the prisoner's dilemma. The subjects were students in their first and final years of undergraduate economics, and undergraduates in other disciplines. Subjects were paired, placed in a typical game scenario, then asked to choose either to "cooperate" or to "defect". Pairs of subjects were told that if they both chose "defect" the payoff for each would be 1. If both cooperated, the payoff for each would be 2. If one defected and the other cooperated, the payoff would be 3 for the defector and 0 for the cooperator. Each subject in a pair made his choice without knowing what the other member of the pair chose.
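
With those payoffs, defection is the dominant strategy even though mutual cooperation pays more, which is the tension the experiment probes. A two-line check (strategy labels are mine):

payoff = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}

for opponent in ("C", "D"):
    better = max(("C", "D"), key=lambda me: payoff[(me, opponent)])
    print("against", opponent, "the higher payoff comes from", better)
# Both lines report D, so purely self-interested play predicts mutual defection
# and a payoff of 1 each, rather than the 2 each from mutual cooperation.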

First year economics students, and students in disciplines other than economics, overwhelmingly chose to cooperate. But fourth-year economics students tended not to cooperate. Frank et al. concluded that, with "an eye toward both the social good and the well-being of their own students, economists may wish to stress a broader view of human motivation in their teaching."

Formal definition
The game given in Figure 2 is a coordination game if the following payoff inequalities hold for player 1 (rows): A > B, D > C, and for player 2 (columns): a > b, d > c. The strategy pairs (H, H) and (G, G) are then the only pure Nash equilibria. In addition there is a mixed Nash equilibrium where player 1 plays H with probability p = (d-c)/(a-b-c+d) and G with probability 1–p; player 2 plays H with probability q = (D-C)/(A-B-C+D) and G with probability 1–q.

Strategy pair (H, H) payoff dominates (G, G) if A ≥ D, a ≥ d, and at least one of the two is a strict inequality: A > D or a > d.

Strategy pair (G, G) risk dominates (H, H) if the product of the deviation losses is highest for (G, G) (Harsanyi and Selten, 1988, Lemma 5.4.4). In other words, if the following inequality holds: (C – D)(c – d) ≥ (B – A)(b – a). If the inequality is strict then (G, G) strictly risk dominates (H, H).[2] (That is, players have more incentive to deviate.)

If the game is symmetric, that is, if A = a, B = b, etc., the inequality allows for a simple interpretation: we assume the players are unsure about which strategy the opponent will pick and assign probabilities for each strategy. If each player assigns probabilities ½ to H and G each, then (G, G) risk dominates (H, H) if the expected payoff from playing G exceeds the expected payoff from playing H: ½ B + ½ D ≥ ½ A + ½ C, or simply B + D ≥ A + C.

Another way to calculate the risk dominant equilibrium is to calculate the risk factor for all equilibria and to find the equilibrium with the smallest risk factor. To calculate the risk factor in our 2x2 game, consider the expected payoff to a player if they play H: E[\pi_H] = pA + (1-p)C (where p is the probability that the other player will play H), and compare it to the expected payoff if they play G: E[\pi_G] = pB + (1-p)D. The value of p which makes these two expected values equal is the risk factor for the equilibrium (H, H), with 1-p the risk factor for playing (G, G). You can also calculate the risk factor for playing (G, G) by doing the same calculation, but setting p as the probability the other player will play G. An interpretation for p is that it is the smallest probability that the opponent must play that strategy such that the person's own payoff from copying the opponent's strategy is greater than if the other strategy was played.
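
Both dominance notions can be checked mechanically. The sketch below uses illustrative stag-hunt style numbers in the A, B, C, D / a, b, c, d layout defined above (H is the high-paying but risky action, G the safe one):

from fractions import Fraction as F

A, B, C, D = F(4), F(3), F(0), F(3)   # player 1's payoffs at (H,H), (G,H), (H,G), (G,G)
a, b, c, d = F(4), F(3), F(0), F(3)   # player 2's payoffs at (H,H), (H,G), (G,H), (G,G)

HH_payoff_dominates = A >= D and a >= d and (A > D or a > d)
GG_risk_dominates   = (C - D) * (c - d) >= (B - A) * (b - a)

risk_factor_HH = (D - C) / (A - B - C + D)   # opponent must play H at least this often
risk_factor_GG = 1 - risk_factor_HH

print(HH_payoff_dominates, GG_risk_dominates)   # True True
print(risk_factor_HH, risk_factor_GG)           # 3/4 1/4: (G, G) has the smaller risk factor
# So (H, H) payoff dominates while (G, G) risk dominates, the classic stag hunt tension.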

Equilibrium selection
A number of evolutionary approaches have established that when played in a large population, players might fail to play the payoff dominant equilibrium strategy and instead end up in the payoff dominated, risk dominant equilibrium. Two separate evolutionary models both support the idea that the risk dominant equilibrium is more likely to occur. The first model, based on replicator dynamics, predicts that a population is more likely to adopt the risk dominant equilibrium than the payoff dominant equilibrium. The second model, based on best response strategy revision and mutation, predicts that the risk dominant state is the only stochastically stable equilibrium. Both models assume that multiple two-player games are played in a population of N players. The players are matched randomly with opponents, with each player having equal likelihoods of drawing any of the N−1 other players. The players start with a pure strategy, G or H, and play this strategy against their opponent. In replicator dynamics, the population game is repeated in sequential generations where subpopulations change based on the success of their chosen strategies. In best response, players update their strategies to improve expected payoffs in the subsequent generations. The recognition of Kandori, Mailath & Rob (1993) and Young (1993) was that if the rule to update one's strategy allows for mutation[4], and the probability of mutation vanishes, i.e. asymptotically reaches zero over time, the likelihood that the risk dominant equilibrium is reached goes to one, even if it is payoff dominated.[3]
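
A crude simulation in the spirit of these models (population size, revision rule and mutation rate are all arbitrary choices of mine, and the payoffs are the stag-hunt numbers from the sketch above) shows the risk dominant state absorbing most of the play even when the process starts at the payoff dominant one:

import random
from collections import Counter

A, B, C, D = 4, 3, 0, 3           # player payoffs at (H,H), (G,H), (H,G), (G,G)
N, mutation, rounds = 10, 0.1, 20000
random.seed(1)

state = ["H"] * N                 # start everyone at the payoff dominant action
time_at = Counter()
for _ in range(rounds):
    i = random.randrange(N)                       # one randomly chosen player revises
    others = state[:i] + state[i + 1:]
    p = others.count("H") / (N - 1)               # chance a random opponent plays H
    best = "H" if p * A + (1 - p) * C >= p * B + (1 - p) * D else "G"
    state[i] = best if random.random() >= mutation else random.choice("HG")
    time_at[state.count("H")] += 1

print("periods at all-G:", time_at[0], " periods at all-H:", time_at[N])
# Despite starting at all-H, the process typically spends far more time at the
# risk dominant all-G state, in line with the Kandori-Mailath-Rob and Young results.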

QMRIn game theory, the traveler's dilemma (sometimes abbreviated TD) is a type of non-zero-sum game in which two players attempt to maximize their own payoff, without any concern for the other player's payoff.

Formulation
The game was formulated in 1994 by Kaushik Basu and goes as follows:[1][2]

An airline loses two suitcases belonging to two different travelers. Both suitcases happen to be identical and contain identical antiques. An airline manager tasked to settle the claims of both travelers explains that the airline is liable for a maximum of $100 per suitcase—he is unable to find out directly the price of the antiques.

To determine an honest appraised value of the antiques, the manager separates both travelers so they can't confer, and asks them to write down the amount of their value at no less than $2 and no larger than $100. He also tells them that if both write down the same number, he will treat that number as the true dollar value of both suitcases and reimburse both travelers that amount. However, if one writes down a smaller number than the other, this smaller number will be taken as the true dollar value, and both travelers will receive that amount along with a bonus/malus: $2 extra will be paid to the traveler who wrote down the lower value and a $2 deduction will be taken from the person who wrote down the higher amount. The challenge is: what strategy should both travelers follow to decide the value they should write down?

Analysis
One might expect a traveler's optimum choice to be $100; that is, the traveler values the antiques at the airline manager's maximum allowed price. Remarkably, and, to many, counter-intuitively, the Nash equilibrium solution is in fact just $2; that is, the traveler values the antiques at the airline manager's minimum allowed price.

For an understanding of why $2 is the Nash equilibrium consider the following proof:

Alice, having lost her antiques, is asked their value. Alice's first thought is to quote $100, the maximum permissible value.
On reflection, though, she realizes that her fellow traveler, Bob, might also quote $100. And so Alice changes her mind, and decides to quote $99, which, if Bob quotes $100, will pay $101.
But Bob, being in an identical position to Alice, might also think of quoting $99. And so Alice changes her mind, and decides to quote $98, which, if Bob quotes $99, will pay $100. This is greater than the $99 Alice would receive if both she and Bob quoted $99.
This cycle of thought continues, until Alice finally decides to quote just $2—the minimum permissible price.
Another proof goes as follows:

If Alice only wants to maximize her own payoff, choosing $99 weakly dominates choosing $100. If Bob chooses any dollar value from $2 to $98 inclusive, $99 and $100 give Alice equal payoffs; if Bob chooses $99 or $100, choosing $99 nets Alice more ($2 more against a claim of $99, $1 more against $100).
A similar line of reasoning shows that, once $100 can be ruled out for Bob, choosing $98 is at least as good for Alice as choosing $99. The only situation where choosing $99 would give a higher payoff than choosing $98 is if Bob chooses $100, but a Bob who is only seeking to maximize his own profit will always choose $99 instead of $100.
This line of reasoning can be applied to all of Alice's whole-dollar options until she finally reaches $2, the lowest price.

Experimental results[edit]
The ($2, $2) outcome in this instance is the Nash equilibrium of the game. By definition this means that if your opponent chooses this Nash equilibrium value then your best choice is that Nash equilibrium value of $2. This will not be the optimum choice if there is a chance of your opponent choosing a higher value than $2.[3] When the game is played experimentally, most participants select a value close to $100.

Furthermore, the travelers are rewarded for deviating strongly from the Nash equilibrium in the game and obtain much higher rewards than would be realized with the purely rational strategy. These experiments (and others, such as focal points) show that the majority of people do not use purely rational strategies, but the strategies they do use are demonstrably optimal. This paradox has led some to question the value of game theory in general, while others have suggested that a new kind of reasoning is required to understand how it can be quite rational ultimately to make non-rational choices. For instance, Capraro has proposed a model where humans do not act a priori as single agents but forecast how the game would be played if they formed coalitions, and then act so as to maximize the forecast. His model fits the experimental data on the Traveler's dilemma and similar games quite well.[4]

Variation[edit]
One variation of the original traveler's dilemma in which both travelers are offered only two integer choices, $2 or $3, is identical mathematically to the Prisoner's dilemma and thus the traveler's dilemma can be viewed as an extension of prisoner's dilemma. The traveler's dilemma is also related to the game Guess 2/3 of the average in that both involve deep iterative deletion of dominated strategies in order to demonstrate the Nash equilibrium, and that both lead to experimental results that deviate markedly from the game-theoretical predictions.

Payoff matrix[edit]
The canonical payoff matrix is shown below (if only integer inputs are taken into account):

Canonical TD payoff matrix
          100        99        98        97     ...      3       2
100    100, 100   97, 101   96, 100   95, 99    ...    1, 5    0, 4
 99    101,  97   99,  99   96, 100   95, 99    ...    1, 5    0, 4
 98    100,  96  100,  96   98,  98   95, 99    ...    1, 5    0, 4
 97     99,  95   99,  95   99,  95   97, 97    ...    1, 5    0, 4
...        ...       ...       ...       ...    ...     ...     ...
  3      5,  1     5,  1     5,  1     5,  1    ...    3, 3    0, 4
  2      4,  0     4,  0     4,  0     4,  0    ...    4, 0    2, 2
Denoting by S = \{2,...,100\} the set of strategies available to both players and by F: S \times S \rightarrow \mathbb{R} the payoff function of one of them we can write

F(x,y) = \min(x,y) + 2\cdot\sgn(y-x)
(Note that the other player receives F(y,x) since the game is quantitatively symmetric.)
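A brute-force Python sketch (not from the source) that checks the claim above: with the payoff F(x,y) = min(x,y) + 2·sgn(y−x) over integer claims 2..100, the only pure-strategy Nash equilibrium is ($2, $2).

def sgn(z):
    return (z > 0) - (z < 0)

def F(x, y):
    """Payoff of the player who claims x when the other claims y."""
    return min(x, y) + 2 * sgn(y - x)

S = range(2, 101)
equilibria = [(x, y) for x in S for y in S
              if F(x, y) >= max(F(x2, y) for x2 in S)     # x is a best response to y
              and F(y, x) >= max(F(y2, x) for y2 in S)]   # y is a best response to x
print(equilibria)  # [(2, 2)]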

Graphical games[edit]
Say that each player's utility depends only on his own action and the action of one other player - for instance, I depends on II, II on III and III on I. Representing such a game would require only three 2x2 utility tables, containing in all only 12 utility values.
Player I's utility (rows: I's action, columns: II's action)
      L   R
  T   9   8
  B   3   4

Player II's utility (rows: II's action, columns: III's action)
      l   r
  L   6   8
  R   1   3

Player III's utility (rows: III's action, columns: I's action)
      T   B
  l   4   4
  r   5   7
Graphical games are games in which the utility of each player depends on the actions of very few other players. If d is the greatest number of players by whose actions any single player is affected (that is, the indegree of the game graph), the number of utility values needed to describe the game is ns^{d+1}, which, for small d, is a considerable improvement.

It has been shown that any normal form game is reducible to a graphical game with all degrees bounded by three and with two strategies for each player.[3] Unlike normal form games, the problem of finding a pure Nash equilibrium in graphical games (if one exists) is NP-complete.[4] The problem of finding a (possibly mixed) Nash equilibrium in a graphical game is PPAD-complete.[5] Finding a correlated equilibrium of a graphical game can be done in polynomial time, and for a graph with a bounded treewidth, this is also true for finding an optimal correlated equilibrium.[2]
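A minimal Python sketch (not from the source) of the representation above: each player's utility is a small table indexed only by his own action and the action of the one neighbour he depends on (I on II, II on III, III on I), using the numbers from the three tables.

# 12 stored numbers instead of the 3 * 2^3 = 24 a normal form table would need.
u_I   = {('T','L'): 9, ('T','R'): 8, ('B','L'): 3, ('B','R'): 4}   # depends on II
u_II  = {('L','l'): 6, ('L','r'): 8, ('R','l'): 1, ('R','r'): 3}   # depends on III
u_III = {('l','T'): 4, ('l','B'): 4, ('r','T'): 5, ('r','B'): 7}   # depends on I

def utilities(a1, a2, a3):
    """Utilities of players I, II, III for the strategy profile (a1, a2, a3)."""
    return u_I[(a1, a2)], u_II[(a2, a3)], u_III[(a3, a1)]

print(utilities('T', 'L', 'r'))  # (9, 8, 5)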

Sparse games[edit]
When most of the utilities are 0, as below, it is easy to come up with a succinct representation.
         L, l       L, r       R, l       R, r
  T   0, 0, 0    2, 0, 1    0, 0, 0    0, 7, 0
  B   0, 0, 0    0, 0, 0    2, 0, 3    0, 0, 0
Sparse games are those where most of the utilities are zero. Graphical games may be seen as a special case of sparse games.

For a two player game, a sparse game may be defined as a game in which each row and column of the two payoff (utility) matrices has at most a constant number of non-zero entries. It has been shown that finding a Nash equilibrium in such a sparse game is PPAD-hard, and that there does not exist a fully polynomial-time approximation scheme unless PPAD is in P.[6]
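A short Python sketch (an assumption about how one might store it, not from the source) of the succinct representation: only the non-zero entries of the sparse 3-player game above are kept, keyed by strategy profile.

nonzero = {
    ('T', 'L', 'r'): (2, 0, 1),
    ('T', 'R', 'r'): (0, 7, 0),
    ('B', 'R', 'l'): (2, 0, 3),
}

def utilities(profile):
    return nonzero.get(profile, (0, 0, 0))  # every unlisted profile pays zero to everyone

print(utilities(('B', 'R', 'l')))  # (2, 0, 3)
print(utilities(('T', 'L', 'l')))  # (0, 0, 0)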

These games are in the form of quadrant matrices

Symmetric games[edit]
Suppose all three players are identical and face the strategy set {T,B}. Let #TP and #BP be the number of a player's peers who've chosen T and B, respectively. Describing this game requires only 6 utility values.
      #TP=2, #BP=0   #TP=1, #BP=1   #TP=0, #BP=2
  T         5              2              2
  B         1              7              2
In symmetric games all players are identical, so in evaluating the utility of a combination of strategies, all that matters is how many of the n players play each of the s strategies. Thus, describing such a game requires giving only s\tbinom{n+s-2}{s-1} utility values.

In a symmetric game with 2 strategies there always exists a pure Nash equilibrium – although a symmetric pure Nash equilibrium may not exist.[7] The problem of finding a pure Nash equilibrium in a symmetric game (with possibly more than two players) with a constant number of actions is in AC0; however, when the number of actions grows with the number of players (even linearly) the problem is NP-complete.[8] In any symmetric game there exists a symmetric equilibrium. Given a symmetric game of n players facing k strategies, a symmetric equilibrium may be found in polynomial time if k=O(\log n/\log \log n).[9] Finding a correlated equilibrium in symmetric games may be done in polynomial time.[2]
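A small Python check (a sketch, not from the source) of the count quoted above: a symmetric game with n players and s strategies needs s·C(n+s-2, s-1) utility values.

from math import comb

def symmetric_game_size(n, s):
    return s * comb(n + s - 2, s - 1)

print(symmetric_game_size(3, 2))  # 6, matching the 6 values in the table above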

Anonymous games[edit]
If players were different but did not distinguish between other players we would need to list 18 utility values to represent the game - one table such as that given for "symmetric games" above for each player.
      #TP=2, #BP=0   #TP=1, #BP=1   #TP=0, #BP=2
  T      8, 8, 2        2, 9, 5        4, 1, 4
  B      6, 1, 3        2, 2, 1        7, 0, 6
In anonymous games, players have different utilities but do not distinguish between other players (for instance, having to choose between "go to cinema" and "go to bar" while caring only about how crowded each place will be, not whom they will meet there). In such a game a player's utility again depends on how many of his peers choose which strategy, and on his own strategy, so sn\tbinom{n+s-2}{s-1} utility values are required.

If the number of actions grows with the number of players, finding a pure Nash equilibrium in an anonymous game is NP-hard.[8] An optimal correlated equilibrium of an anonymous game may be found in polynomial time.[2] When the number of strategies is 2, there is a known PTAS for finding an ε-approximate Nash equilibrium.[10]

Polymatrix games[edit]
If the game in question were a polymatrix game, describing it would require 24 utility values: one 2x2 bimatrix for each of the three pairs of players, with each entry listing the utility components of the two paired players.

Pair (I, II)   (rows: I's action, columns: II's action)
         L        R
  T    4, 6     8, 7
  B    3, 7     9, 1

Pair (I, III)  (rows: I's action, columns: III's action)
         l        r
  T    7, 7     1, 6
  B    8, 6     6, 4

Pair (II, III) (rows: II's action, columns: III's action)
         l        r
  L    2, 9     3, 3
  R    2, 4     1, 5
If strategy profile (B,R,l) was chosen, player I's utility would be 9+8=17, player II's utility would be 1+2=3, and player III's utility would be 6+4=10.
In a polymatrix game (also known as a multimatrix game), there is a utility matrix for every pair of players (i,j), denoting a component of player i's utility. Player i's final utility is the sum of all such components. The number of utility values required to represent such a game is O(n^2 s^2).
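A minimal Python sketch (not from the source) of the summation just described, reproducing the worked example above for the profile (B, R, l).

u_I_II   = {('T','L'): (4, 6), ('T','R'): (8, 7), ('B','L'): (3, 7), ('B','R'): (9, 1)}
u_I_III  = {('T','l'): (7, 7), ('T','r'): (1, 6), ('B','l'): (8, 6), ('B','r'): (6, 4)}
u_II_III = {('L','l'): (2, 9), ('L','r'): (3, 3), ('R','l'): (2, 4), ('R','r'): (1, 5)}

def utilities(a1, a2, a3):
    i_from_ii, ii_from_i  = u_I_II[(a1, a2)]
    i_from_iii, iii_from_i = u_I_III[(a1, a3)]
    ii_from_iii, iii_from_ii = u_II_III[(a2, a3)]
    return (i_from_ii + i_from_iii,      # player I: sum of his two pairwise components
            ii_from_i + ii_from_iii,     # player II
            iii_from_i + iii_from_ii)    # player III

print(utilities('B', 'R', 'l'))  # (17, 3, 10), as in the example above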

Polymatrix games always have at least one mixed Nash equilibrium.[11] The problem of finding a Nash equilibrium in a polymatrix game is PPAD-complete.[5] Finding a correlated equilibrium of a polymatrix game can be done in polynomial time.[2]

Circuit

Circuit games[edit]
Let us now equate the players' various strategies with the Boolean values "0" and "1", and let X stand for player I's choice, Y for player II's choice and Z for player III's choice. Let us assign each player a circuit:
Player I: X ∧ (Y ∨ Z)
Player II: X ⨁ Y ⨁ Z
Player III: X ∨ Y
These describe the utility table below.
        (Y,Z)=(0,0)   (Y,Z)=(0,1)   (Y,Z)=(1,0)   (Y,Z)=(1,1)
  X=0     0, 0, 0       0, 1, 0       0, 1, 1       0, 0, 1
  X=1     0, 1, 1       1, 0, 1       1, 0, 1       1, 1, 1
The most flexible way of representing a succinct game is to represent each player by a polynomial-time bounded Turing machine, which takes as its input the actions of all players and outputs the player's utility. Such a Turing machine is equivalent to a Boolean circuit, and it is this representation, known as circuit games, that we will consider.
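A short Python sketch (not from the source) that rebuilds the utility table above by evaluating each player's Boolean circuit on every strategy profile (X, Y, Z).

from itertools import product

def utilities(x, y, z):
    u1 = x & (y | z)   # player I:   X AND (Y OR Z)
    u2 = x ^ y ^ z     # player II:  X XOR Y XOR Z
    u3 = x | y         # player III: X OR Y
    return u1, u2, u3

for x, y, z in product((0, 1), repeat=3):
    print((x, y, z), utilities(x, y, z))   # matches the table above row by row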

Computing the value of a 2-player zero-sum circuit game is an EXP-complete problem,[12] and approximating the value of such a game up to a multiplicative factor is known to be in PSPACE.[13] Determining whether a pure Nash equilibrium exists is a \Sigma_2^{\rm P}-complete problem (see Polynomial hierarchy).[14]

Other representations[edit]
Many other types of succinct game exist (many having to do with allocation of resources). Examples include congestion games, network congestion games, scheduling games, local effect games, facility location games, action-graph games, hypergraphical games and more.

QMRIn game theory, a symmetric equilibrium is an equilibrium where both players use the same strategy (possibly mixed) in the equilibrium. In the Prisoner's Dilemma game, the only Nash equilibrium is (D, D). Since both players use the same strategy, the equilibrium is symmetric.

Symmetric equilibria have important properties. Only symmetric equilibria can be evolutionarily stable states in single population models.

QMRMatching pennies is the name for a simple example game used in game theory. It is the two strategy equivalent of Rock, Paper, Scissors. Matching pennies is used primarily to illustrate the concept of mixed strategies and a mixed strategy Nash equilibrium.

The game is played between two players, Player A and Player B. Each player has a penny and must secretly turn the penny to heads or tails. The players then reveal their choices simultaneously. If the pennies match (both heads or both tails) Player A keeps both pennies, so wins one from Player B (+1 for A, -1 for B). If the pennies do not match (one heads and one tails) Player B keeps both pennies, so receives one from Player A (-1 for A, +1 for B). This is an example of a zero-sum game, where one player's gain is exactly equal to the other player's loss.

The game can be written in a payoff matrix. Each cell of the matrix shows the two players' payoffs, with Player A's payoffs listed first.

This game has no pure strategy Nash equilibrium since there is no pure strategy (heads or tails) that is a best response to a best response. In other words, there is no pair of pure strategies such that neither player would want to switch if told what the other would do. Instead, the unique Nash equilibrium of this game is in mixed strategies: each player chooses heads or tails with equal probability.[1] In this way, each player makes the other indifferent between choosing heads or tails, so neither player has an incentive to try another strategy.
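A minimal Python check (a sketch, not from the source) that the 50/50 mix makes the opponent indifferent; A is Player A's payoff matrix, and Player B's payoffs are its negation since the game is zero-sum.

A = [[ 1, -1],   # A plays Heads vs (B Heads, B Tails)
     [-1,  1]]   # A plays Tails vs (B Heads, B Tails)

p = 0.5  # probability that Player A plays Heads
# Player B's expected payoffs (the negatives of A's) from playing Heads or Tails:
eb_heads = -(p * A[0][0] + (1 - p) * A[1][0])
eb_tails = -(p * A[0][1] + (1 - p) * A[1][1])
print(eb_heads, eb_tails)  # both 0.0: B is indifferent, so the mix is an equilibrium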

The matching pennies game is mathematically equivalent to the games "Morra" or "odds and evens", where two players simultaneously display one or two fingers, with the winner determined by whether or not the number of fingers match. Again, the only way to avoid being exploited in these games is to play the equilibrium mixed strategy.

Of course, human players might not faithfully apply the equilibrium strategy, especially if matching pennies is played repeatedly. In a repeated game, if one is sufficiently adept at psychology, it may be possible to predict the opponent's move and choose accordingly, in the same manner as expert Rock, Paper, Scissors players. In this way, a positive expected payoff might be attainable, whereas when either player plays the equilibrium, everyone's expected payoff is zero.

Nonetheless, statistical analysis of penalty kicks in soccer—a high-stakes real-world situation that closely resembles the matching pennies game—has shown that the decisions of kickers and goalies resemble a mixed strategy equilibrium.[2][3]

QMRIn game theory, a subgame perfect equilibrium (or subgame perfect Nash equilibrium) is a refinement of a Nash equilibrium used in dynamic games. A strategy profile is a subgame perfect equilibrium if it represents a Nash equilibrium of every subgame of the original game. Informally, this means that if (1) the players played any smaller game that consisted of only one part of the larger game and (2) their behavior represents a Nash equilibrium of that smaller game, then their behavior is a subgame perfect equilibrium of the larger game. Every finite extensive game has a subgame perfect equilibrium.[1]

QMRMinimax (sometimes MinMax or MM[1]) is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario. Originally formulated for two-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision making in the presence of uncertainty.
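A minimal Python sketch (not from the source) of one common form of the rule for the row player of a zero-sum game given by a gain matrix: choose the action whose worst-case payoff is largest. The matrix values are hypothetical.

def minimax_row(matrix):
    worst_by_row = [min(row) for row in matrix]                 # worst case of each action
    best = max(range(len(matrix)), key=lambda i: worst_by_row[i])
    return best, worst_by_row[best]

M = [[ 3, -2,  2],
     [-1,  0,  4],
     [-4, -3,  1]]
print(minimax_row(M))  # (1, -1): row 1 guarantees a payoff of at least -1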

QMRMax Weber (1921) presented the idea of instrumental action as the "highest form of rational conduct"(Scott p. 570). The orientation of instrumental action is an ideal type in more than just a methodological sense (Scott). Weber's concept of instrumental action is known as "zweckrational". Zweckrational is just one of Weber's four ideal types of social action, which also includes wertrational (rational action in relation to a value), affective or emotional action, and traditional action. Weber believed that human behavior was increasingly becoming guided more by zweckrational action and less by tradition, values and emotions ([3]).
QMRMax Weber (1864-1920) was a sociologist concerned with rationalization. Rationalization is the process whereby an increasing number of social actions and social relationships become based on considerations of efficiency or calculation. Weber believed that there are four ideal types of social action. Ideal types are used as a tool to look at real cases and compare them to the ideal types to see where they fall. No social action is purely just one of the four types.

Traditional Social Action: actions controlled by traditions, “the way it has always been done”
Affective Social Action: actions determined by one’s specific affections and emotional state, you do not think about the consequences
Value Rational Social Action: actions that are determined by a conscious belief in the inherent value of a type of behavior (ex: religion)
Instrumental-Rational Social Action: actions that are carried out to achieve a certain goal, you do something because it leads to a result

QMRRubinstein bargaining model
From Wikipedia, the free encyclopedia
A Rubinstein bargaining model refers to a class of bargaining games that feature alternating offers through an infinite time horizon. The original proof is due to Ariel Rubinstein in a 1982 paper.[1] For a long time, the solution to this type of game was a mystery; thus, Rubinstein's solution is one of the most influential findings in game theory.

A standard Rubinstein bargaining model has the following elements:

Two players
Complete information
Unlimited offers—the game keeps going until one player accepts an offer
Alternating offers—the first player makes an offer in the first period, if the second player rejects, the game moves to the second period in which the second player makes an offer, if the first rejects, the game moves to the third period, and so forth
Delays are costly
Solution[edit]
Consider the typical Rubinstein bargaining game in which two players decide how to divide a pie of size 1. An offer by a player takes the form x = (x1, x2) with x1 + x2 = 1. Assume the players discount at the geometric rate of d, which can be interpreted as cost of delay or "pie spoiling". That is, 1 step later, the pie is worth d times what it was, for some d with 0<d<1.

Any x can be a Nash equilibrium outcome of this game, resulting from the following strategy profile: Player 1 always proposes x = (x1, x2) and only accepts offers x ' where x1' ≥ x1. Player 2 always proposes x = (x1, x2) and only accepts offers x ' where x2' ≥ x2.

In the above Nash equilibrium, player 2's threat to reject any offer less than x2 is not credible. In the subgame where player 1 did offer x2' where x2 > x2' > d x2, clearly player 2's best response is to accept.

To derive a sufficient condition for subgame perfect equilibrium, let x = (x1, x2) and y = (y1, y2) be two divisions of the pie with the following property:

x2 = d y2, and
y1 = d x1.
Consider the strategy profile where player 1 offers x and accepts no less than y1, and player 2 offers y and accepts no less than x2. Player 2 is now indifferent between accepting and rejecting, so the threat to reject lesser offers is now credible. The same applies to a subgame in which it is player 1's turn to decide whether to accept or reject. In this subgame perfect equilibrium, player 1 gets 1/(1+d) while player 2 gets d/(1+d). This subgame perfect equilibrium is essentially unique.

A Generalization[edit]
When the discount factor is different for the two players, d1 for the first one and d2 for the second, let us denote the value for the first player as v(d1,d2). Then a reasoning similar to the above gives

1 − v(d1,d2) = d2 * v(d2,d1)

1 − v(d2,d1) = d1 * v(d1,d2)

yielding v(d1,d2) = (1 − d2) / (1 − d1 d2). This expression reduces to the original one for d1 = d2 = d.
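A small Python check (a sketch, not from the source) of the formulas above: the closed form reduces to 1/(1+d) for equal discount factors and satisfies both fixed-point equations.

def first_mover_share(d1, d2):
    """Share of the pie going to the player who makes the first offer."""
    return (1 - d2) / (1 - d1 * d2)

print(first_mover_share(0.9, 0.9))   # 0.526..., equal to 1/(1+d) = 1/1.9

d1, d2 = 0.9, 0.8
v12, v21 = first_mover_share(d1, d2), first_mover_share(d2, d1)
print(abs((1 - v12) - d2 * v21) < 1e-12)   # True: the pair solves 1 - v(d1,d2) = d2*v(d2,d1)
print(v12)                                 # the more patient first mover gets more than half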

Desirability[edit]
Rubinstein bargaining has become pervasive in the literature because it has many desirable qualities:

It has all the aforementioned requirements, which are thought to accurately simulate real-world bargaining.
There is a unique solution.
The solution is pretty clean, which wasn't necessarily expected given the game is infinite.
There is no delay in the transaction.
As both players become infinitely patient or can make counteroffers increasingly quickly (i.e. as d approaches 1), then both sides get half of the pie.
The result quantifies the advantage of being the first to propose (and thus potentially avoiding the discount).
The generalized result quantifies the advantage of being less pressed for time, i.e. of having a discount factor closer to 1 than that of the other party.

QMRLester Frank Ward (1841-1913), sometimes referred to as the "father" of American sociology, rejected many of Spencer's theories regarding the evolution of societies. Ward, who was also a botanist and a paleontologist, believed that the law of evolution functioned much differently in human societies than it did in the plant and animal kingdoms, and theorized that the "law of nature" had been superseded by the "law of the mind".[12] He stressed that humans, driven by emotions, create goals for themselves and strive to realize them (most effectively with the modern scientific method) whereas there is no such intelligence and awareness guiding the non-human world.[13] Plants and animals adapt to nature; man shapes nature. While Spencer believed that competition and "survival of the fittest" benefited human society and socio-cultural evolution, Ward regarded competition as a destructive force, pointing out that all human institutions, traditions and laws were tools invented by the mind of man and that that mind designed them, like all tools, to "meet and checkmate" the unrestrained competition of natural forces.[12] Ward agreed with Spencer that authoritarian governments repress the talents of the individual, but he believed that modern democratic societies, which minimized the role of religion and maximized that of science, could effectively support the individual in his or her attempt to fully utilize their talents and achieve happiness. He believed that the evolutionary processes have four stages:
First comes cosmogenesis, creation and evolution of the world.
Then, when life arises, there is biogenesis.[13]
Development of humanity leads to anthropogenesis, which is influenced by the human mind.[13]
Finally there arrives sociogenesis, which is the science of shaping the evolutionary process itself to optimize progress, human happiness and individual self-actualization.[13]


QMRThe von Neumann-Morgenstern axioms[edit]
There are four axioms[5] of the expected utility theory that define a rational decision maker. They are completeness, transitivity, independence and continuity.

Completeness assumes that an individual has well defined preferences and can always decide between any two alternatives.

Axiom (Completeness): For every A and B either A \succeq B or A \preceq B.
This means that the individual either prefers A to B, or is indifferent between A and B, or prefers B to A.

Transitivity assumes that, as an individual decides according to the completeness axiom, the individual also decides consistently.

Axiom (Transitivity): For every A, B and C with A \succeq B and B \succeq C we must have A \succeq C.
Independence also pertains to well-defined preferences and assumes that two gambles mixed with a third one maintain the same preference order as when the two are presented independently of the third one. The independence axiom is the most controversial one.

Axiom (Independence): Let A, B, and C be three lotteries with A \succeq B, and let t \in (0, 1]; then tA+(1-t)C \succeq t B+(1-t)C .
Continuity assumes that when there are three lotteries (A, B and C) and the individual prefers A to B and B to C, then there should be a possible combination of A and C in which the individual is then indifferent between this mix and the lottery B.

Axiom (Continuity): Let A, B and C be lotteries with A \succeq B \succeq C; then there exists a probability p such that B is equally good as pA+(1-p)C.
If all these axioms are satisfied, then the individual is said to be rational and the preferences can be represented by a utility function, i.e. one can assign numbers (utilities) to each outcome of the lottery such that choosing the best lottery according to the preference \succeq amounts to choosing the lottery with the highest expected utility. This result is called the von Neumann–Morgenstern utility representation theorem.

In other words: if an individual's behavior always satisfies the above axioms, then there is a utility function such that the individual will choose one gamble over another if and only if the expected utility of one exceeds that of the other. The expected utility of any gamble may be expressed as a linear combination of the utilities of the outcomes, with the weights being the respective probabilities. Utility functions are also normally continuous functions. Such utility functions are also referred to as von Neumann–Morgenstern (vNM) utility functions. This is a central theme of the expected utility hypothesis in which an individual chooses not the highest expected value, but rather the highest expected utility. The expected utility maximizing individual makes decisions rationally based on the axioms of the theory.
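A minimal Python sketch (not from the source) of what "choosing by expected utility" means in practice; the square-root utility here is a hypothetical concave (risk-averse) example, not anything prescribed by the theorem.

from math import sqrt

def expected_utility(lottery, u):
    """lottery: list of (probability, outcome) pairs."""
    return sum(p * u(x) for p, x in lottery)

u = sqrt                           # hypothetical risk-averse vNM utility
sure_thing = [(1.0, 36.0)]
gamble     = [(0.5, 100.0), (0.5, 0.0)]

# The gamble has the higher expected value (50 > 36) ...
print(sum(p * x for p, x in gamble), sum(p * x for p, x in sure_thing))
# ... but the sure thing has the higher expected utility (6 > 5), so it is chosen.
print(expected_utility(sure_thing, u), expected_utility(gamble, u))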

The von Neumann–Morgenstern formulation is important in the application of set theory to economics because it was developed shortly after the Hicks-Allen "ordinal revolution" of the 1930s, and it revived the idea of cardinal utility in economic theory. Note, however, that while in this context the utility function is cardinal, in that implied behavior would be altered by a non-linear monotonic transformation of utility, the expected utility function is ordinal because any monotonic increasing transformation of it gives the same behavior.

Theory of Games and Economic Behavior, published in 1944[1] by Princeton University Press, is a book by mathematician John von Neumann and economist Oskar Morgenstern which is considered the groundbreaking text that created the interdisciplinary research field of game theory. In the introduction of its 60th anniversary commemorative edition from the Princeton University Press, the book is described as "the classic work upon which modern-day game theory is based."

QMRIn decision theory, the von Neumann-Morgenstern utility theorem shows that, under certain axioms of rational behavior, a decision-maker faced with risky (probabilistic) outcomes of different choices will behave as if he is maximizing the expected value of some function defined over the potential outcomes. This function is known as the von Neumann-Morgenstern utility function. The theorem is the basis for expected utility theory.

In 1947, John von Neumann and Oskar Morgenstern proved that any individual whose preferences satisfied four [1] axioms has a utility function; such an individual's preferences can be represented on an interval scale and the individual will always prefer actions that maximize expected utility. That is, they proved that an agent is (VNM-)rational if and only if there exists a real-valued function u defined by possible outcomes such that every preference of the agent is characterized by maximizing the expected value of u, which can then be defined as the agent's VNM-utility (it is unique up to adding a constant and multiplying by a positive scalar). No claim is made that the agent has a "conscious desire" to maximize u, only that u exists.

The axioms[edit]
The four axioms of VNM-rationality are then completeness, transitivity, continuity, and independence.

Completeness assumes that an individual has well defined preferences:

Axiom 1 (Completeness) For any lotteries L,M, exactly one of the following holds:
\, L\prec M\, , \, M\prec L\, , or \, L \sim M  
(either M is preferred, L is preferred, or the individual is indifferent[4]).

Transitivity assumes that preference is consistent across any three options:

Axiom 2 (Transitivity) If \,L \preceq M\, and \,M \preceq N\,, then \,L \preceq N\,.
Continuity assumes that there is a "tipping point" between being better than and worse than a given middle option:

Axiom 3 (Continuity): If \,L \preceq M\preceq N\,, then there exists a probability \,p\in[0,1]\, such that
\,pL + (1-p)N\, \sim \,M\,
where the notation on the left side refers to a situation in which L is received with probability p and N is received with probability (1–p).

Instead of continuity, an alternative axiom can be assumed that does not involve a precise equality, called the Archimedean property.[3] It says that any separation in preference can be maintained under a sufficiently small deviation in probabilities:

Axiom 3′ (Archimedean property): If \,L \prec M\prec N\,, then there exists a probability \,\varepsilon\in(0,1) such that
\,(1-\varepsilon)L + \varepsilon N\, \prec \,M \, \prec \,\varepsilon L + (1-\varepsilon)N.\,
Only one of (3) and (3′) need be assumed, and the other will be implied by the theorem.

Independence of irrelevant alternatives assumes that a preference holds independently of the possibility of another outcome:

Axiom 4 (Independence): If \,L\prec M\,, then for any \,N\, and \,p\in(0,1]\,,
\,pL+(1-p)N \prec pM+(1-p)N.\,
The independence axiom implies the axiom on reduction of compound lotteries:[5]

Axiom 4′ (Reduction of compound lotteries): For any Z, \,W, any p, q, r \in (0,1] such that rq=p, and any lottery X=qZ+(1-q)W,
pZ+(1-p)W \sim rX +(1-r)W.

QMRSuppose you have an urn containing 30 red balls and 60 other balls that are either black or yellow. You don't know how many black or how many yellow balls there are, only that the total number of black balls plus the total number of yellow balls equals 60. The balls are well mixed so that each individual ball is as likely to be drawn as any other. You are now given a choice between two gambles:

Gamble A: You receive $100 if you draw a red ball.
Gamble B: You receive $100 if you draw a black ball.
Also you are given the choice between these two gambles (about a different draw from the same urn):

Gamble C: You receive $100 if you draw a red or yellow ball.
Gamble D: You receive $100 if you draw a black or yellow ball.
This situation poses both Knightian uncertainty – how many of the non-red balls are yellow and how many are black, which is not quantified – and probability – whether the ball is red or non-red, which is ⅓ vs. ⅔.

The Ellsberg paradox is a paradox in decision theory in which people's choices violate the postulates of subjective expected utility.[1] It is generally taken to be evidence for ambiguity aversion. The paradox was popularized by Daniel Ellsberg, although a version of it was noted considerably earlier by John Maynard Keynes.[2]
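A short Python sketch (not from the source) of why the commonly observed choices violate subjective expected utility: under any single subjective probability b for drawing black (so 2/3 − b for yellow), preferring A to B forces b < 1/3 while preferring D to C forces b > 1/3, so no single prior rationalizes both.

def expected_payoffs(b):
    red, yellow = 1/3, 2/3 - b
    A = 100 * red               # pays on red
    B = 100 * b                 # pays on black
    C = 100 * (red + yellow)    # pays on red or yellow
    D = 100 * (b + yellow)      # pays on black or yellow (= 100 * 2/3 for any b)
    return A, B, C, D

for b in (0.2, 1/3, 0.5):
    A, B, C, D = expected_payoffs(b)
    print(b, A > B, D > C)   # "A > B" and "D > C" are never both True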
