
Game Theoretic Approaches for Spectrum Sharing in Cognitive Radio Networks: A Survey

Manoj C∗ and Amrita Mishra†
Department of Electrical Engineering, Indian Institute of Technology Kanpur
Kanpur, Uttar Pradesh, India.
Email: ∗[email protected], †[email protected]

Abstract: Cognitive radio is gaining popularity and acceptance worldwide as an efficient way to utilise limited wireless spectrum resources. Among the many design requirements of cognitive radio, such as spectrum sensing, spectrum allocation and power control, spectrum sharing is one of the main challenges in ensuring peaceful coexistence of primary and secondary users. Since multiple users compete to obtain the maximum spectrum resources for themselves, game theory is an efficient framework for designing robust, stable and scalable spectrum sharing schemes. In this paper, we discuss how the concepts of game theory can be exploited to design spectrum sharing protocols. Game theoretic models such as the Cournot model, the Bertrand model and the auction based approach, which treat the spectrum sharing scenario from different perspectives, are also presented. Further, the paper examines the scenario in which users competing for spectrum resources have no incentive to cooperate and may even exchange false information about channel conditions to gain more access to the spectrum. Cheat-proof strategies are developed to maintain the efficiency of spectrum usage in this case. Finally, the paper presents a comparative study of the various models along with research challenges and future directions in game theoretic modelling.

Index Terms: Cognitive Radio, Spectrum Sharing, Game Theory.

I. INTRODUCTION

Cognitive radio is gaining acceptance worldwide as the next breakthrough technology in wireless services, enabling efficient utilisation of the available spectrum and thus faster and more reliable communication [1], [2]. It differs from conventional wireless networks in that the users must intelligently and dynamically change their operating parameters as their immediate surroundings change. This ability of the radio transceiver enables the frequency spectrum to be shared among primary (licensed) and secondary (unlicensed) users. Among the various design requirements of a cognitive radio setup, dynamic spectrum sharing among the users is an important aspect. It requires that the performance of the primary user should not degrade due to the opportunistic and selfish behaviour of the secondary users. The final objective is to design a robust spectrum sharing scheme that ensures peaceful functioning of the primary as well as the secondary users.

Users in a cognitive radio network are intelligent: they observe and learn in order to enhance their performance. Traditional spectrum sharing policies assume that all users cooperate unconditionally in a static environment. However, in a cognitive radio scenario where the users work towards different goals, e.g., competing for an open unlicensed band, fully cooperative behaviour cannot be taken for granted. Instead, users will cooperate with others only if cooperation brings them more benefit. The need for users to change and adapt is further reinforced by the changes in the radio environment.

This intelligent behaviour of the users leads researchers to use game theory to analyse the cognitive interaction processes [3]. Game theory is a mathematical framework used to analyse the strategic decisions made by multiple decision makers. Different game models (e.g., non-cooperative/cooperative, static/dynamic, and complete/incomplete information games) have been deployed to model and study user behaviour in different scenarios [4]. The common aim of these models is to improve network performance, in terms of throughput, resource consumption and QoS (Quality of Service), given the self-interest of the participating users.

There are many advantages of studying cognitive radio networks in a game theoretic framework. First, by modelling dynamic spectrum sharing among network users (primary and secondary users) as games, the users' behaviours and actions can be analyzed in a formalized game structure, and the theoretical advancements in game theory can be used to provide upper bounds on performance. Second, the optimization of spectrum usage is generally a multi-objective optimization problem, which is very difficult to analyze and solve. Game theory provides well defined equilibrium criteria to measure game optimality under various game settings. Third, non-cooperative game theory, one of the most important branches of game theory, enables us to derive efficient approaches for dynamic spectrum sharing using only local information. Such approaches become highly desirable when centralized control is not available [5].

The organisation of the paper is as follows. We discuss the basic concepts of game theory in Section II. We then present the various game theoretic models for spectrum sharing along with their simulation results: the Cournot model, the auction based model, the Bertrand model and the model incorporating cheat-proof strategies in Sections III, IV, V and VI respectively. Finally, we conclude in Section VII by putting forth the research challenges and future directions in game theoretic modelling along with a comparison between the various models for spectrum sharing.


II. BASICS OF GAME THEORY

Game theory is a bag of analytical tools designed to help us understand the phenomena that we observe when decision-makers interact [6]. It provides a mathematical model that describes the interaction between agents, each of which tries to maximize its payoff. A game is a description of strategic interaction that includes the constraints on the actions that the players can take and the players' interests, but does not specify the actions that the players do take. The three basic components of a game are the set of players, the set of actions available to each player and the set of preferences of each player.

A. Terminologies

A player is a decision maker in the game. In the cognitive radio scenario, the players are the wireless nodes. The set of actions is the set of alternatives available to each player. At any instant, the player must choose an element from this set. In general, the set of actions can be different for different players. In cognitive radio, the set of actions can be the choice of modulation scheme, coding rate, protocol, flow control parameter, transmit power level, or any other factor that is under the control of the node [7]. When each player chooses an action, the resulting "action profile" determines the outcome of the game. We also assume that a player, when presented with any pair of actions, either knows which of the pair he/she prefers or regards both actions as equally desirable [8]. Usually, preferences are specified by defining a utility function. A higher value of the utility function means that the outcome is more desirable than an outcome with a lower value. The utility is also called the payoff of the user.

Game theory assumes that the players are rational. This means that each player tries to maximize his/her profit irrespective of what other players are doing.

B. Strategic form game and Nash Equilibrium

A strategic game is a model of interactive decision-making in which each decision-maker chooses his/her plan of action once and for all, and these choices are made simultaneously. A Nash equilibrium is a joint strategy from which no player can increase his/her utility by deviating unilaterally. At a Nash equilibrium, each player's action is a best response to the best responses of all other players. The best response of a player is the action that gives him/her the maximum payoff, given that all other players' strategies remain unchanged. The mathematical definition of the Nash equilibrium is given below.

A game consists of a finite set of players $\mathcal{N} = \{1, 2, \ldots, N\}$. Each player $i \in \mathcal{N}$ selects a strategy $s_i \in S_i$ with the objective of maximizing the utility $u_i$. The strategy profile $s$ is the vector containing the strategies of all players, $s = (s_i)_{i \in \mathcal{N}} = (s_1, s_2, \ldots, s_N)$. The collective strategies of all players except player $i$ are denoted by $s_{-i}$. A strategy profile $s \in S$ is a Nash equilibrium if

$$u_i(s) \geq u_i(s'_i, s_{-i}) \quad \forall s'_i \in S_i,\ \forall i \in \mathcal{N}.$$

A Nash equilibrium corresponds to a steady state. If, whenever the game is played, the action profile is the same Nash equilibrium, then no player has a reason to choose any action different from the current action; there is no incentive for the player to change.
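As an illustration of the definition above, the following minimal sketch enumerates the pure-strategy Nash equilibria of a small finite game by testing every unilateral deviation. The two-user payoff table is hypothetical (action 0 is "low power", action 1 is "high power") and is not taken from any of the surveyed models.

```python
from itertools import product

def pure_nash_equilibria(payoffs, n_actions):
    """Return all pure-strategy Nash equilibria of a finite game.

    payoffs:   dict mapping an action profile (tuple) to a tuple of utilities.
    n_actions: list with the number of actions available to each player.
    """
    equilibria = []
    for s in product(*(range(n) for n in n_actions)):
        stable = True
        for i in range(len(n_actions)):
            for alt in range(n_actions[i]):
                # Profile obtained when only player i deviates to `alt`.
                deviated = s[:i] + (alt,) + s[i + 1:]
                if payoffs[deviated][i] > payoffs[s][i]:
                    stable = False
        if stable:
            equilibria.append(s)
    return equilibria

# Hypothetical payoff table: (utility of user 0, utility of user 1).
payoffs = {
    (0, 0): (3, 3),
    (0, 1): (0, 4),
    (1, 0): (4, 0),
    (1, 1): (1, 1),
}
equilibria = pure_nash_equilibria(payoffs, [2, 2])
print(equilibria)  # [(1, 1)]
```

In this toy table both users transmit at high power at equilibrium even though (0, 0) gives both a higher utility, which is the kind of inefficiency discussed later for the Bertrand model.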

C. Cooperative and Non-Cooperative Games

A cooperative game is one in which players form groups or coalitions and these coalitions enforce cooperative behaviour. Here, the game is a competition between coalitions of players rather than between individual players. A non-cooperative game is one in which players make decisions independently, without coordination with other players, and each player has his/her own objectives. In this case, the players see only their own payoff and do not consider the payoff of others or of the whole system. Thus, while they may be able to cooperate, any cooperation must be self-enforcing.

In a cognitive radio framework, each user usually makes its own decisions (possibly relying also on the information collected from other users). These decisions may be constrained by the rules of the operating protocol, but ultimately each user has some freedom in setting parameters or changing its own mode of operation. These users are autonomous agents, taking their own decisions about transmit power, packet forwarding, etc. The users can exhibit three kinds of behaviour:

1) Users may work towards the overall good of the entire network community as a whole.

2) In some cases, the same users may behave selfishly, looking out for only their own interests.

3) Finally, users may behave maliciously, seeking to ruin the network performance of other users.

Game theory can be applied in all the three cases.

III. COURNOT MODEL OF SPECTRUM SHARING

A. What is a Cournot game

The spectrum sharing problem is formulated as an oligopoly market [9]. An oligopoly market is one in which a few firms compete with each other in terms of the amount of a commodity supplied to the market in order to maximise profit. In the spectrum sharing context, the SUs are analogous to the firms that compete for the spectrum offered by the PU. The cost of the spectrum is determined by a pricing function set by the PU. A Cournot game is used to analyze this situation and the Nash equilibrium (NE) is considered as the solution of this game. The main objective of this Cournot game formulation is to maximize the profit of all SUs based on the equilibrium adopted by all SUs.

B. System Model

We consider a wireless system with a PU and multiple SUs (the total number of SUs is denoted by $N$) who want to share the spectrum allocated to the PU. The PU shares some portion $b_i$ of the spectrum with secondary user $i$. The primary user charges the secondary user for the spectrum at a rate of $c(b)$ per unit bandwidth, where $b$ is the amount of available bandwidth that can be shared. The SUs transmit in the allocated spectrum using adaptive modulation to enhance their transmission performance. The revenue of secondary user $i$ is denoted by $r_i$ per unit of achievable transmission rate. The spectral efficiency of the transmission for secondary user $i$ can be obtained from

$$k_i = \log_2(1 + K\gamma_i) \tag{1}$$

where

$$K = \frac{1.5}{\ln\left(0.2/\mathrm{BER}_i^{tar}\right)} \tag{2}$$

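Here $\gamma_i$ is the received SNR and $\mathrm{BER}_i^{tar}$ is the target bit-error-rate of secondary user $i$. We assume that the received SNR information is available at the transmitting end through channel estimation.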
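Equations (1) and (2) can be evaluated directly. The sketch below is illustrative only; it assumes that the SNR is supplied in dB and converted to linear scale before use, and the function names are hypothetical.

```python
import math

def ber_constant(ber_target):
    """K from (2): 1.5 / ln(0.2 / BER_tar)."""
    return 1.5 / math.log(0.2 / ber_target)

def spectral_efficiency(snr_db, ber_target):
    """k_i from (1); the dB-to-linear conversion is an assumption of this sketch."""
    gamma = 10.0 ** (snr_db / 10.0)
    return math.log2(1.0 + ber_constant(ber_target) * gamma)

K = ber_constant(1e-4)                   # roughly 0.197 for BER_tar = 1e-4
k_low = spectral_efficiency(10.0, 1e-4)
k_high = spectral_efficiency(15.0, 1e-4)
print(K, k_low, k_high)
```

A tighter BER target lowers $K$ and therefore the spectral efficiency obtained at a given SNR.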

C. Spectrum Sharing Scheme

We discuss the static and the dynamic Cournot game. The static game models an ideal case in which all SUs can observe the strategies and payoffs of all other SUs. The dynamic game is a more practical version in which the strategies and payoffs of the other SUs are not known to a particular SU. Each SU observes the change in its payoff due to the price charged by the PU and adapts its strategy accordingly.

1) Static Cournot Game: The players (i.e., the firms in the oligopoly market) in this game are the SUs. The strategy of each player is the allocated spectrum size (denoted by $b_i$ for SU $i$), which is non-negative. The payoff of each player is the profit (i.e., revenue minus cost) of secondary user $i$ (denoted by $p_i$). The pricing function used by the PU for charging is given by

$$c(B) = x + y\left(\sum_{j} b_j\right)^{\tau} \tag{3}$$

where $x$ and $y$ are non-negative constants, $\tau \geq 1$ (so that this pricing function is convex), and $B$ denotes the set of strategies of all secondary users (i.e., $B = \{b_1, \ldots, b_N\}$). Let $w$ denote the worth of the spectrum for the PU. The condition $c(B) > w \times \sum_j b_j$ is necessary to ensure that the PU is willing to share spectrum of total size $\sum_j b_j$ with the SUs. The PU charges all of the SUs the same price. The revenue of SU $i$ is $r_i \times k_i \times b_i$, while its cost of spectrum allocation is $b_i c(B)$. The profit of user $i$ is therefore

$$p_i(B) = r_i k_i b_i - b_i c(B) \tag{4}$$

We assume that the guard band used to separate the spectrum allocated to different users is fixed and small. Then, the profit can be rewritten as follows

$$p_i(B) = r_i k_i b_i - b_i\left(x + y\left(\sum_{j} b_j\right)^{\tau}\right) \tag{5}$$

Let $B_{-i} = \{b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_N\}$ denote the set of strategies adopted by all secondary users except user $i$, such that $B = B_{-i} \cup \{b_i\}$. As the optimal allocated spectrum of one user depends on the strategies of all other users, the NE is considered to be the solution of the game, which ensures that all the secondary users are satisfied with the solution. The NE is obtained by using the best response function, which gives the best strategy of one player given the others' strategies [8]. The best response function of secondary user $i$, given the allocated spectrum sizes $b_j$ of the other secondary users, where $j \neq i$, is defined as follows

$$BR_i(B_{-i}) = \arg\max_{b_i}\ p_i(B_{-i} \cup \{b_i\}) \tag{6}$$

The set $B^* = \{b_1^*, \ldots, b_N^*\}$ denotes the Nash equilibrium of this game if and only if

$$b_i^* = BR_i(B_{-i}^*), \quad \forall i \tag{7}$$

where $B_{-i}^*$ denotes the set of best responses of secondary users $j$ for $j \neq i$. We formulate an optimization problem with the objective defined as follows:

$$\text{Minimize:} \quad \sum_{i=1}^{N} \left|b_i - BR_i(B_{-i})\right| \tag{8}$$

i.e., we want to minimize the difference between the decision variables $b_i$ and the corresponding best response functions. The minimum value of the objective function is zero, which is attained when the algorithm reaches the NE.
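For the linear pricing case $\tau = 1$, setting $\partial p_i / \partial b_i = 0$ in (5) gives the best response $b_i = (r_i k_i - x - y \sum_{j \neq i} b_j)/(2y)$, truncated at zero. The sketch below uses this closed form and repeats simultaneous best responses until the objective in (8) vanishes. It is a minimal illustration for two SUs rather than the solver used in [9], and the spectral efficiency values are hypothetical.

```python
def best_response(i, b, r, k, x=0.0, y=1.0):
    """Best response of SU i for tau = 1, from d p_i / d b_i = 0 in (5)."""
    others = sum(b) - b[i]
    return max(0.0, (r[i] * k[i] - x - y * others) / (2.0 * y))

def static_cournot_ne(r, k, x=0.0, y=1.0, tol=1e-10, max_iter=1000):
    """Iterate best responses until the objective (8) is (numerically) zero."""
    b = [0.0] * len(r)
    for _ in range(max_iter):
        br = [best_response(i, b, r, k, x, y) for i in range(len(b))]
        gap = sum(abs(bi - bri) for bi, bri in zip(b, br))  # objective (8)
        if gap < tol:
            break
        b = br
    return b

# r_i = 10 as in the simulation setup; k = [2.0, 2.5] are assumed values.
b_star = static_cournot_ne(r=[10.0, 10.0], k=[2.0, 2.5])
print(b_star)  # approximately [5.0, 10.0]
```

With two SUs each best response has slope $-1/2$ in the other user's strategy, so the iteration contracts. With more users, simultaneous best-response updates are not guaranteed to converge and a damped update may be needed.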

2) Dynamic Cournot Game: Here each SU reaches the NE by interacting with the PU only. Each SU communicates with the PU to obtain the price for different strategies. The adjustment of the allocated spectrum size can be modelled as a repeated Cournot game:

$$b_i(t+1) = b_i(t) + \alpha_i b_i(t) \frac{\partial p_i(B)}{\partial b_i(t)} \tag{9}$$

where $b_i(t)$ is the allocated spectrum size at time $t$ and $\alpha_i$ is the speed adjustment parameter (i.e., learning rate) of SU $i$.
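The update rule (9) can be sketched as follows for the linear pricing case $\tau = 1$, where $\partial p_i / \partial b_i = r_i k_i - x - y \sum_j b_j - y b_i$. The spectral efficiencies, starting point and learning rate are assumptions chosen for illustration; the rate is kept deliberately small here, since a rate that is too large makes the iteration oscillate or diverge.

```python
def marginal_profit(i, b, r, k, x=0.0, y=1.0):
    """d p_i / d b_i for the profit in (5) with tau = 1."""
    return r[i] * k[i] - x - y * sum(b) - y * b[i]

def dynamic_cournot(b0, r, k, alpha, steps, x=0.0, y=1.0):
    """Apply the adjustment rule (9) for a fixed number of steps."""
    b = list(b0)
    for _ in range(steps):
        grad = [marginal_profit(i, b, r, k, x, y) for i in range(len(b))]
        b = [b[i] + alpha[i] * b[i] * grad[i] for i in range(len(b))]
    return b

# Assumed values: k = [2.0, 2.5], alpha = 0.02, start at b = (1, 1).
b_final = dynamic_cournot([1.0, 1.0], r=[10.0, 10.0], k=[2.0, 2.5],
                          alpha=[0.02, 0.02], steps=500)
print(b_final)  # approaches the static NE, approximately [5.0, 10.0]
```

Each SU only needs its own marginal profit, which it can estimate from the price quoted by the PU, so no information about the other SUs' payoffs is required.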

D. Simulation Results

We consider a cognitive radio environment with one PU and two SUs sharing a spectrum of 15 MHz. The target BER for both SUs is $\mathrm{BER}_i^{tar} = 10^{-4}$. For the pricing function of the PU, we use $x = 0$ and $y = 1$, while $\tau$ is adjusted based on the evaluation scenario (e.g., $\tau = 1.0$), and the worth of spectrum for the PU is $w = 1$. The revenue of an SU per unit transmission rate is $r_i = 10\ \forall i$. We also assume that the SNR information $\gamma_i$ is available to all SUs through channel estimation [10].

Fig. 2 shows the best responses of both SUs in the static Cournot game. The best response of each SU is a linear function of the other user's strategy. The Nash equilibrium is located at the point at which the best responses intersect. We also observe that under different channel qualities, the Nash equilibrium is located at different points. The trajectory of spectrum sharing in the dynamic Cournot game is also shown for the case of $\alpha_1 = \alpha_2 = 0.14$. We observe that, with the same speed adjustment parameter, better channel quality results in more fluctuations in the trajectory to the NE.


Fig. 1. System Model for Spectrum Sharing.

Fig. 2. Best responses and trajectories of both SU’s to NE.

IV. SPECTRUM SHARING USING AUCTION BASED APPROACH

A. System Model

Let us consider a system with only one primary user (PU) and a group $\mathcal{I} = \{1, \ldots, I\}$ of secondary users (SUs) who want to share the spectrum $B_{tot}$ allocated to the primary user (as shown in Fig. 1). The primary user retains a given amount of bandwidth $B_{rem} > B_{req}$, where $B_{req}$ is the bandwidth required to meet its quality of service requirement. The primary user charges secondary users at a price of $p$ per unit bandwidth. After the allocation, the secondary users may transmit in the allocated spectrum using adaptive modulation to enhance the transmission performance. The revenue of secondary user $i$ is denoted by $r_i$ per unit of achievable transmission rate. The spectral efficiency of transmission for user $i$ is denoted by $k_i$. We assume that through channel estimation, the secondary users can obtain the received SNR of the channel. For secondary user $i$, given the received SNR, the target BER $\mathrm{BER}_i^{tar}$ and the assigned spectrum $B_i$, the transmission rate (in bits per second) can be obtained.

B. Bandwidth Auction

The problem of spectrum sharing is formulated as an auction in which the secondary users (SUs) make bids for the bandwidth allocated to the primary user (PU). An auction with relatively simple rules is proposed below to characterize the interaction between the primary user and multiple secondary users.

1) Information: Each SU $i$ knows its revenue $r_i$ per unit of achievable transmission rate, and it also knows its spectral efficiency $k_i$. In a real network, $r_i$ relates to the QoS. The PU announces a positive reserve bid $\beta > 0$ and the price $p > 0$ to all SUs before the auction starts.

2) Bids: SU $i$ submits a bid $b_i$ ($0 \leq b_i \leq B_{tot}$), which generally represents the maximum bandwidth that the SU desires for data transmission.

3) Allocation: The PU allocates bandwidth according to (10) (here we only consider the FDM scheme). Once the bandwidth is allocated by the PU, there is no contention among the SUs.

$$B_i = \frac{b_i}{\sum_{j \in \mathcal{I}} b_j + \beta} B_{tot} \tag{10}$$

4) Payments: SU $i$ pays the PU

$$C_i = p\,\theta_i b_i \tag{11}$$

where $\theta_i$ is a user dependent priority parameter. We adopt a 'prepay' mechanism in which the SU pays for the bandwidth it bids rather than the bandwidth assigned by the PU. The prepay mechanism is a crucial part of the auction rules, as it prevents the SUs from over-bidding: they pay for their own bids.

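A bidding profile is defined as the vector containing the SUs' bids, $\mathbf{b} = (b_1, \ldots, b_I)$. The bidding profile of SU $i$'s opponents is defined as $\mathbf{b}_{-i} = (b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_I)$, such that $\mathbf{b} = (b_i; \mathbf{b}_{-i})$. Under the rules of this auction, $b_i \in b_R \triangleq [0, B_{tot}]$ and the bidding profile $\mathbf{b}$ is constrained by

$$\mathbf{b} \in \mathbf{b}_R \triangleq \{\mathbf{b} \mid 0 \leq b_i \leq B_{tot}\ \forall i \in \mathcal{I}\} \tag{12}$$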

In this auction, a positive reserve bid $\beta$ is used by the PU to control the portion of the spectrum that remains for its own usage. The PU sets $\beta$ such that $\beta \geq B_{req}$ is satisfied.

Given the allocated bandwidth, SU $i$'s revenue is given by

$$R_i = r_i k_i B_i \tag{13}$$


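SU $i$ chooses the bid $b_i$ that maximises its payoff, i.e., its revenue minus its payment:

$$U_i(b_i; \mathbf{b}_{-i}; p) = R_i[B_i(b_i; \mathbf{b}_{-i})] - C_i(b_i, p) \tag{14}$$

The desirable outcome of an auction is a Nash Equilibrium (NE), which is a bidding profile $\mathbf{b}^*$ such that no user wants to deviate from it, i.e.,

$$U_i(b_i^*; \mathbf{b}_{-i}^*; p) \geq U_i(b_i; \mathbf{b}_{-i}^*; p) \quad \forall i \in \mathcal{I},\ b_i \in b_R \tag{15}$$

We define SU $i$'s best response as

$$\mathcal{B}_i(\mathbf{b}_{-i}; p) = \left\{ b_i \,\middle|\, b_i = \arg\max_{\tilde{b}_i \in b_R} U_i(\tilde{b}_i; \mathbf{b}_{-i}; p) \right\} \tag{16}$$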

This in general could be a set. An NE is also a fixed point of the best responses of all the SUs. We state certain properties of the NE along with a dynamic updating algorithm to reach the NE in a distributed fashion.

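Theorem 1: There are two extreme prices $\underline{p}_i$ and $\overline{p}_i$ defined as

$$\underline{p}_i = \frac{r_i k_i B_{tot}\left\{(I-1)B_{tot} + \beta\right\}}{\theta_i (I B_{tot} + \beta)^2} \tag{17}$$

$$\overline{p}_i = \frac{r_i k_i B_{tot}}{\theta_i \beta} \tag{18}$$

If $p < \underline{p}_i$, all the SUs would bid for the maximum bandwidth allocated to the PU (i.e., $b_i = B_{tot}\ \forall i \in \mathcal{I}$); if $p > \overline{p}_i$, no SU would be willing to use any of the spectrum offered by the PU (i.e., $b_i = 0\ \forall i \in \mathcal{I}$).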

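Theorem 2: There is a unique NE for the bids of the SUs. In addition, if $p \in (\underline{p}_i, \overline{p}_i)$, SU $i$'s unique best response function is given as follows:

$$\mathcal{B}_i(\mathbf{b}_{-i}, p) = \left[\sqrt{\frac{r_i k_i B_{tot}\left(\sum_{j \neq i} b_j + \beta\right)}{p\,\theta_i}} - \left(\sum_{j \neq i} b_j + \beta\right)\right]_0^{B_{tot}} \tag{19}$$

where $[x]_a^b$ is defined as

$$[x]_a^b = \max\{\min\{x, b\}, a\} \tag{20}$$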

Theorem 3: If the unique NE is interior (an interior NE is one in which none of the participating users selects a strategy on the boundary of its strategy space), then the bandwidth allocation is fair.
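The best response in (19) follows from setting $\partial U_i / \partial b_i = 0$ in (14) with the allocation rule (10) and the payment (11). The sketch below iterates this best response to a fixed point. It uses $B_{tot} = 10$, $p = 10$, $\beta = 0.2$ and $r_i = 10$ from the simulation setup, while $\theta_i = 1$ and the spectral efficiencies are assumed values; it is an illustration, not the procedure of [11].

```python
import math

def best_response(i, bids, r, k, theta, price, beta, b_tot):
    """Best response of SU i from (19): interior optimum clipped to [0, B_tot]."""
    others = sum(bids) - bids[i] + beta          # sum_{j != i} b_j + beta
    interior = math.sqrt(r[i] * k[i] * b_tot * others / (price * theta[i])) - others
    return max(0.0, min(interior, b_tot))        # [x]_0^{B_tot} as in (20)

def allocation(bids, beta, b_tot):
    """Bandwidth assigned to each SU by rule (10)."""
    total = sum(bids) + beta
    return [b / total * b_tot for b in bids]

def auction_ne(r, k, theta, price, beta, b_tot, tol=1e-10, max_iter=200):
    """Repeat simultaneous best responses until the bids stop changing."""
    bids = [0.0] * len(r)
    for _ in range(max_iter):
        new = [best_response(i, bids, r, k, theta, price, beta, b_tot)
               for i in range(len(bids))]
        if max(abs(n - b) for n, b in zip(new, bids)) < tol:
            return new
        bids = new
    return bids

# k = [2.0, 3.0] and theta = [1, 1] are assumptions; the rest follows the text.
params = dict(r=[10.0, 10.0], k=[2.0, 3.0], theta=[1.0, 1.0],
              price=10.0, beta=0.2, b_tot=10.0)
bids = auction_ne(**params)
shares = allocation(bids, params["beta"], params["b_tot"])
print(bids, shares)
```

For these values the price lies between the two extreme prices of Theorem 1 for both users, so the equilibrium bids are interior. The SU with the better channel bids more and is assigned a larger share.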

C. Dynamic Updating Algorithm

In a practical cognitive radio scenario, the SUs may only be able to observe the pricing and assignment information from the primary user (PU), but not the strategies and payoffs of other secondary users. Hence, we also investigate a distributed algorithm with which each SU reaches the Nash equilibrium based on its interaction with the PU only. Here, each SU communicates with the PU to obtain the price and the assignments for different bids, and updates its bid as follows:

$$b_i(t+1) = b_i(t) + \alpha_i b_i(t) \frac{\partial U_i(\mathbf{b})}{\partial b_i(t)} \tag{21}$$

where $b_i(t)$ is the bid in terms of bandwidth at time $t$ and $\alpha_i$ is the speed adjustment parameter of SU $i$.

Fig. 3. Region of values for stable Nash Equilibrium.

Fig. 4. Nash Equilibrium of bid under different channel qualities.

D. Simulation Results

A cognitive radio environment with one PU and two SUs sharing a spectrum of $B_{tot} = 10$ MHz is considered. The target BER for both SUs is $\mathrm{BER}_i^{tar} = 10^{-4}$. The revenue of an SU per unit transmission rate is $r_i = 10\ \forall i \in \mathcal{I}$. We also assume that the SNR information $\gamma_i$ is available to all SUs through channel estimation. The PU sets the price $p = 10$ per unit bandwidth and the reserve bid $\beta = 0.2$ [11].

In Fig. 3, the regions indicated by arrows are the regions for which the spectrum sharing is stable and the NE is reached; elsewhere the sharing is unstable and fluctuations occur.

In Fig. 4, we observe the adaptation of the SUs' bids under different channel qualities. As expected, SU 2 bids for more bandwidth and achieves higher revenue when its channel quality becomes better. We also observe that the bid of one user depends on the channel quality and bid of the other user.


V. BERTRAND GAME MODEL

An oligopoly market can also be modeled as a Bertrand game, in which the firms compete by setting prices. Here at least two sellers producing homogeneous products compete by setting prices simultaneously; buyers buy everything from the firm with the lower price. The solution of this Bertrand game is the Nash Equilibrium. We apply the Bertrand game model to the problem of competitive spectrum pricing for dynamic spectrum access [12]. In this model, a few primary services compete to offer spectrum to a secondary service.

A. System Model

Consider a wireless environment with $N$ primary services operating on frequency bands $F_i$ and a secondary service with a group of secondary users. Primary service $i$, which serves $M_i$ local connections, wants to sell part of the spectrum $F_i$ at price $p_i$ per unit bandwidth to the secondary service. The spectrum demand depends on the data rate achievable in the spectrum and on the price charged. The spectral efficiency $k$ of transmission by a secondary user is given by

$$k = \log_2(1 + K\gamma), \quad \text{where} \quad K = \frac{1.5}{\ln\left(0.2/\mathrm{BER}^{tar}\right)} \tag{22}$$

where $\gamma$ is the SNR at the receiver and $\mathrm{BER}^{tar}$ is the target bit-error-rate.

B. Spectrum Pricing Competition

To quantify the spectrum demand, we consider the quadratic utility function given by

$$U(\mathbf{b}) = \sum_{i=1}^{N} b_i k_i^{(s)} - \frac{1}{2}\left(\sum_{i=1}^{N} b_i^2 + 2\nu \sum_{i \neq j} b_i b_j\right) - \sum_{i=1}^{N} p_i b_i \tag{23}$$

where $\mathbf{b}$ is the set of sizes of the spectrum shared by all primary services, i.e., $\mathbf{b} = \{b_1, \ldots, b_i, \ldots, b_N\}$, $p_i$ is the price offered by primary service $i$, and $k_i^{(s)}$ denotes the spectral efficiency of transmission by a secondary user using the spectrum $F_i$ owned by primary service $i$. The spectrum substitutability parameter $\nu$ represents the ability of the secondary user to switch among the frequencies offered by the primary services. $\nu = 0$ means that the secondary user cannot switch among the frequency spectra, while $\nu = 1$ implies that the secondary user can switch among the spectra freely.

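The spectrum demand function $D_i(\mathbf{p})$ for spectrum $F_i$ at the secondary service is obtained by differentiating $U(\mathbf{b})$ with respect to $b_i$ and equating the result to zero. It is given by

$$D_i(\mathbf{p}) = \frac{\left(k_i^{(s)} - p_i\right)\left(\nu(N-2) + 1\right) - \nu \sum_{j \neq i}\left(k_j^{(s)} - p_j\right)}{(1 - \nu)\left(\nu(N-1) + 1\right)} \tag{24}$$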
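As a numerical check on (24), the sketch below compares the closed form with a direct solution of the first-order conditions of (23). It assumes that each pair of spectra is counted once in the cross term of (23), so that the conditions read $k_i^{(s)} - p_i - b_i - \nu \sum_{j \neq i} b_j = 0$; the numerical values are arbitrary test inputs.

```python
import numpy as np

def demand_closed_form(k_s, p, nu):
    """Spectrum demand D_i(p) from (24)."""
    k_s, p = np.asarray(k_s, float), np.asarray(p, float)
    n = len(p)
    margin = k_s - p                      # k_i^(s) - p_i
    others = margin.sum() - margin        # sum over j != i
    return (margin * (nu * (n - 2) + 1) - nu * others) / (
        (1 - nu) * (nu * (n - 1) + 1))

def demand_from_foc(k_s, p, nu):
    """Solve the linear first-order conditions of (23) for b."""
    n = len(p)
    # Matrix with 1 on the diagonal and nu elsewhere.
    A = (1 - nu) * np.eye(n) + nu * np.ones((n, n))
    return np.linalg.solve(A, np.asarray(k_s, float) - np.asarray(p, float))

k_s = [2.9, 2.5, 3.1]    # assumed spectral efficiencies
p = [1.0, 0.8, 1.2]      # assumed prices
d_formula = demand_closed_form(k_s, p, nu=0.4)
d_solved = demand_from_foc(k_s, p, nu=0.4)
print(d_formula, d_solved)
```

The two results agree, which confirms that (24) is the inverse of the linear system defined by the first-order conditions under the stated assumption.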

The cost function of the primary service is developed by considering the degradation of the QoS of the primary users. The revenue function $R_i$ and the cost function $C_i$ are defined as

$$R_i = c_1 M_i, \qquad C_i(b_i) = c_2 M_i \left(B_i^{req} - k_i^{(p)} \frac{W_i - b_i}{M_i}\right)^2 \tag{25}$$

where $c_1$ and $c_2$ denote the constant weights for the revenue and cost functions respectively, $B_i^{req}$ is the bandwidth requirement of a primary connection, $W_i$ is the size of the spectrum, $M_i$ is the number of primary connections and $k_i^{(p)}$ is the spectral efficiency of wireless transmission for primary service $i$. Based on this model, a Bertrand game is formulated as

Players: Primary services
Strategy: Price per unit spectrum $p_i$ (non-negative)
Payoff: Profit $P_i$ (revenue minus cost) realized by selling the spectrum to the secondary service

Based on the spectrum demand, revenue and cost functions, the profit of each primary service is given by $P_i(\mathbf{p}) = b_i p_i + R_i - C_i(b_i)$, where $\mathbf{p} = \{p_1, \ldots, p_i, \ldots, p_N\}$ is the set of prices offered by all players in the game.

The NE is obtained by using the fact that it is the best strategy of each player, given the others' strategies. The best response of primary service $i$, given the prices $\mathbf{p}_{-i}$ offered by the other primary services ($\mathbf{p} = \mathbf{p}_{-i} \cup \{p_i\}$), is defined as

$$B_i(\mathbf{p}_{-i}) = \arg\max_{p_i}\ P_i(\mathbf{p}_{-i} \cup \{p_i\}) \tag{26}$$

$\mathbf{p}^* = \{p_1^*, \ldots, p_N^*\}$ denotes the NE of this game if and only if

$$p_i^* = B_i(\mathbf{p}_{-i}^*), \quad \forall i \tag{27}$$

where $\mathbf{p}_{-i}^*$ denotes the best responses of all players except player $i$. We can obtain the NE by solving the equations $\partial P_i(\mathbf{p}) / \partial p_i = 0$ for all $i$.

In a cognitive radio setting, a primary service will not be able to observe the profit gained and the strategy adopted by the other primary services. It therefore has to decide its strategy from the observed history. We thus use a distributed price adjustment algorithm which progressively reaches the NE.

Let $p_i[t]$ denote the price offered by primary service $i$ at time $t$; $\mathbf{p}[t]$ and $\mathbf{p}_{-i}[t]$ are defined similarly. We consider two cases: the first, in which the strategies of the other primary services in the previous iteration are known to all, and the second, in which they are not observable. In the first case, the price offered by the primary service can be obtained from

$$p_i[t+1] = B_i(\mathbf{p}_{-i}[t]) \quad \forall i \tag{28}$$

In the second case, the primary service has only local information and the spectrum demand. Using these, it adjusts its price in the direction that maximizes its profit, as given by the equation

$$p_i[t+1] = p_i[t] + \alpha_i \left(\frac{\partial P_i(\mathbf{p})}{\partial p_i}\right) \tag{29}$$

where $\alpha_i$ is the learning rate.

The first case has no control parameters and is proved to be stable [12] by considering the eigenvalues of the Jacobian matrix. In the second case, the algorithm can be either stable or unstable depending on the learning rate $\alpha_i$, the number of local connections $M_i$ and the spectrum substitutability factor $\nu$.
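The second case can be sketched for two primary services as follows. The profit is built from the demand (24) and the revenue and cost functions (25), and the marginal profit in (29) is estimated by a central finite difference. The values $M_i = 10$, $W_i = 20$, $B^{req} = 2$, $c_1 = c_2 = 2$ and the initial prices follow the performance evaluation setup; $\nu = 0.4$, the spectral efficiencies $k^{(p)} = 1$ and $k^{(s)} = 2.856$ and the learning rate are assumptions made only for this illustration.

```python
NU = 0.4               # substitutability, assumed within the 0.1 to 0.6 range
M = 10                 # local connections per primary service
W = 20.0               # spectrum per primary service (MHz)
B_REQ = 2.0            # bandwidth requirement per primary connection
C1 = C2 = 2.0          # revenue and cost weights
K_P = 1.0              # primary spectral efficiency (assumed)
K_S = [2.856, 2.856]   # secondary spectral efficiencies (assumed)

def demand(p):
    """Demand (24) specialised to N = 2."""
    m = [K_S[0] - p[0], K_S[1] - p[1]]
    den = (1 - NU) * (1 + NU)
    return [(m[0] - NU * m[1]) / den, (m[1] - NU * m[0]) / den]

def profit(i, p):
    """P_i(p) = b_i p_i + R_i - C_i(b_i), with R_i and C_i from (25)."""
    b = demand(p)[i]
    cost = C2 * M * (B_REQ - K_P * (W - b) / M) ** 2
    return b * p[i] + C1 * M - cost

def marginal_profit(i, p, h=1e-6):
    """Central-difference estimate of dP_i/dp_i."""
    up, dn = list(p), list(p)
    up[i] += h
    dn[i] -= h
    return (profit(i, up) - profit(i, dn)) / (2 * h)

def adjust_prices(p0, alpha=0.05, steps=500):
    """Price iteration (29) with a common learning rate."""
    p = list(p0)
    for _ in range(steps):
        grad = [marginal_profit(i, p) for i in range(2)]
        p = [p[i] + alpha * grad[i] for i in range(2)]
    return p

prices = adjust_prices([1.0, 1.0])
print(prices)  # both prices settle near 1.34 for these assumed values
```

In this symmetric setting the profit of each service is a concave quadratic in its own price, so a small learning rate converges. Raising `alpha` far enough makes the same iteration oscillate, in line with the stability discussion above.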


C. Inefficiency of Nash Equilibrium

The total profit of all primary services is given by $\sum_{j=1}^{N} P_j(\mathbf{p})$. The optimal price for all primary services can be obtained from

$$\frac{\partial \sum_{j=1}^{N} P_j(\mathbf{p})}{\partial p_i} = 0 \tag{30}$$

The optimal values of $p_i$ obtained from this equation differ from those at the NE, so primary services may cooperate to achieve a higher profit. In a repeated game, the game is played multiple times and the users can observe the outcome of previous games, so they can learn to cooperate. Since this optimal price is not the NE, some primary services may deviate unilaterally to increase their own profit. The optimal pricing is therefore not a stable equilibrium. As the optimal pricing is desirable, it can be sustained by a punishment mechanism that punishes any user that deviates from the optimal price. When a user deviates from the optimal pricing, the punishment action is triggered and all the users switch to the NE state, from which no player will deviate. We consider a trigger strategy in which each primary service maintains the collusion as long as the other services do so; if a primary service deviates, the punishment action is triggered.

A primary service usually gives a smaller weight to the profit in future stages than to the profit in the current stage. If the current profit is $P_i$, the profit in the next stage is worth $\delta_i P_i$, where $\delta_i$ is the weight. Let $P_i^o$, $P_i^n$ and $P_i^d$ denote the profits of primary service i when following the optimal price, when following the price at the NE and when deviating, respectively. The collusion will be maintained if the long-term profit from collusion is higher than that obtained by deviation. Mathematically [12],

$$\frac{1}{1 - \delta_i} P_i^o \geq P_i^d + \frac{\delta_i}{1 - \delta_i} P_i^n \tag{31}$$

A lower bound on $\delta_i$ can be obtained from this as

$$\delta_i \geq \frac{P_i^d - P_i^o}{P_i^d - P_i^n} \tag{32}$$

Collusion will be maintained only where δi satisfies (32).
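The bound (32) follows from (31) by multiplying both sides by $1 - \delta_i$ and rearranging. The short sketch below evaluates both expressions for hypothetical profit values, which are not taken from [12], to show that they agree.

```python
def min_discount_factor(p_opt, p_ne, p_dev):
    """Lower bound on delta_i from (32)."""
    return (p_dev - p_opt) / (p_dev - p_ne)

def collusion_sustained(delta, p_opt, p_ne, p_dev):
    """Direct check of condition (31) for a given weight delta."""
    long_term_collusion = p_opt / (1 - delta)
    long_term_deviation = p_dev + delta / (1 - delta) * p_ne
    return long_term_collusion >= long_term_deviation

# Hypothetical stage profits: optimal 12, NE 10, deviation 15.
threshold = min_discount_factor(p_opt=12.0, p_ne=10.0, p_dev=15.0)
print(threshold)                                      # 0.6
print(collusion_sustained(0.7, 12.0, 10.0, 15.0))     # True
print(collusion_sustained(0.5, 12.0, 10.0, 15.0))     # False
```

The more a service values future profit (larger $\delta_i$), the less attractive a one-time gain from deviating becomes.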

D. Performance Evaluation

We consider a cognitive radio environment with two primary services and a secondary service. 20 MHz of frequency spectrum is available to each primary service. The number of local connections at each primary service is 10. The target BER of the secondary service is $\mathrm{BER}^{tar} = 10^{-4}$. The bandwidth requirement of the connections at each primary service is 2 Mbps ($B^{req} = 2$), and $c_1 = c_2 = 2$. The channel quality for the secondary service varies between 9 and 22 dB. The spectrum substitutability factor lies between 0.1 and 0.6. For the dynamic price adaptation algorithms, the initial prices are set as $p_1[0] = p_2[0] = 1$.

If the primary services can observe each other's strategies (Case 1), the price converges to the equilibrium price in a few iterations. However, if only the spectrum demand from the secondary service is observable (Case 2) and the price is adjusted based on this information, the speed of convergence depends on the learning rate $\alpha$. An optimum learning rate makes the algorithm converge as fast as in Case 1, but a larger learning rate causes fluctuations in the price adaptation and the algorithm requires a large number of iterations to converge.

Fig. 5. Profit of each primary service at equilibrium under different channel qualities of frequency spectrum offered by primary service one.

The profit of both primary services at the Nash equilibrium is shown in Fig. 5. When the channel quality of the spectrum offered by primary service one is better, the spectrum demand becomes higher, so primary service one can increase the price as well as the size of the offered spectrum share to gain higher revenue. When primary service one gains a higher profit due to larger demand, primary service two gains only a lower profit due to smaller demand. The spectrum substitutability factor $\nu$ also affects the prices under different channel qualities. A larger $\nu$ only slightly affects the price offered by primary service one, whereas the price offered by primary service two decreases at a higher rate for a larger value of $\nu$. A smaller value of $\nu$ lowers the price offered by primary service one, so the rate of decrease in the price offered by service two has to be higher to attract the secondary service. This is required to achieve the highest profit given the channel qualities corresponding to the spectrum offered by primary service one.

VI. CHEAT-PROOF STRATEGIES

In a cognitive radio environment, users competing for the open spectrum may have no incentive to cooperate with each other. They may even exchange false private information, such as channel conditions, to get more access to the spectrum. Cheat-proof spectrum sharing schemes should therefore be developed to maintain the efficiency of spectrum usage. Mechanism design theory is used to provide incentives for players to be honest [13], and statistical approaches are used to make cheating unprofitable.

A. System Model

We consider a situation where K pairs of unlicensed users coexist in the same area and compete for an unlicensed spectrum band. The users trying to communicate with their pair cause interference to other pairs. At time slot n, all pairs try to occupy the spectrum, and the received signal at the i-th receiver, $y_i[n]$, can be expressed as

$$y_i[n] = \sum_{j=1}^{K} h_{ji}[n]\, x_j[n] + w_i[n], \quad i = 1, 2, \ldots, K \tag{33}$$

where $x_j[n]$ is the information transmitted by the j-th pair, $h_{ji}[n]$ ($j = 1, 2, \ldots, K$; $i = 1, 2, \ldots, K$) represents the channel gain from the j-th transmitter to the i-th receiver, and $w_i[n]$ is the white noise at the i-th receiver. The transmission power of the i-th user is bounded by $P_i^M$, i.e., $|x_i[n]|^2 \le P_i^M$ at all n.

Each user is selfish and tries to maximize its own profit. This spectrum sharing game can be modelled as:

Players: K transmitter-receiver pairs
Strategy: Transmission power of each user, $p_i \in [0, P_i^M]$
Payoff: $R_i(p_1, p_2, \ldots, p_K)$, the gain of transmission achieved by the i-th player after the players have chosen the power levels $p_1, p_2, \ldots, p_K$.

The averaged payoff of the i-th player is given by

$$R_i(p_1, p_2, \ldots, p_K) = \log_2\left(1 + \frac{p_i |h_{ii}|^2}{N_0 + \sum_{j \neq i} p_j |h_{ji}|^2}\right). \tag{34}$$

First, we consider a one-shot game in which the players consider only the current payoff. It is proved in [13] that the only Nash equilibrium of this game is $(P_1^M, P_2^M, \ldots, P_K^M)$. The payoff at the NE is given by

$$R_i^S(h_{1i}, h_{2i}, \ldots, h_{Ki}) = \log_2\left(1 + \frac{P_i^M |h_{ii}|^2}{N_0 + \sum_{j \neq i} P_j^M |h_{ji}|^2}\right) \tag{35}$$

The superscript ‘S’ stands for selfish. This is the only possible outcome of a one-shot game. When all users transmit at maximum power, they cause strong mutual interference to one another.
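The one-shot outcome can be checked numerically with a minimal sketch of the payoff in (34). The two-pair gain matrix, noise level and power limits below are made-up values, and the function names are hypothetical. The grid search illustrates that a player's payoff grows with its own power whatever the others do, and the last lines compare the selfish payoff with simple alternation, where each pair transmits alone in half of the slots.

```python
import math


def payoff(i, powers, gains, noise):
    """Rate of pair i as in (34): log2(1 + SINR_i).

    gains[j][i] holds |h_ji|^2, the power gain from transmitter j to receiver i.
    """
    signal = powers[i] * gains[i][i]
    interference = sum(powers[j] * gains[j][i]
                       for j in range(len(powers)) if j != i)
    return math.log2(1.0 + signal / (noise + interference))


def best_response(i, powers, gains, noise, p_max, steps=100):
    """Grid search over [0, p_max] for the power that maximizes pair i's payoff."""
    candidates = [p_max * k / steps for k in range(steps + 1)]

    def value(p):
        trial = list(powers)
        trial[i] = p
        return payoff(i, trial, gains, noise)

    return max(candidates, key=value)


# Hypothetical two-pair example (all numbers are made up for illustration).
GAINS = [[1.0, 0.5],
         [0.5, 1.0]]
NOISE = 0.1
P_MAX = [1.0, 1.0]

# Payoff of pair 0 when both pairs transmit at maximum power (selfish outcome).
selfish = payoff(0, P_MAX, GAINS, NOISE)
# Payoff of pair 0 when the pairs alternate: half the slots, no interference.
alternating = 0.5 * payoff(0, [P_MAX[0], 0.0], GAINS, NOISE)
```

With these assumed numbers the alternating payoff exceeds the selfish one, which is the motivation for the repeated-game treatment that follows.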

Spectrum sharing lasts over a long period of time, so everyone will be better off if the users take turns to transmit. Such cooperation must be self-enforced. We now consider a repeated game which lasts over several rounds, where the players view these rounds as a whole. The payoff is given by

$$U_i = (1 - \delta) \sum_{n=0}^{+\infty} \delta^n R_i[n] \tag{36}$$

where $R_i[n]$ is the payoff of player i at time slot n and δ is the discount factor as defined earlier.
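As a quick sanity check on the normalization factor $(1-\delta)$ in (36), suppose, purely as an illustrative assumption, that the per-slot payoff is constant, $R_i[n] = R$ for all n, with $0 \le \delta < 1$. The geometric series then gives

$$U_i = (1-\delta) \sum_{n=0}^{+\infty} \delta^n R = (1-\delta)\,\frac{R}{1-\delta} = R .$$

So the factor rescales the discounted sum to the same units as a one-slot payoff, which is what allows $U_i$ to be compared directly with one-slot quantities.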

If all the players follow some predetermined rules to share the spectrum, a higher expected one-slot payoff $R_i^C$ (‘C’ stands for cooperation, $R_i^C > R_i^S \;\; \forall i = 1, 2, \ldots, K$) can be achieved.

But selfish players can take advantage of others by transmitting in the time slots not allotted to them. This gives a payoff $R_i^D$ (‘D’ stands for deviation). Cooperation is not a stable equilibrium in the one-shot game, but it can be enforced in a repeated game by the threat of punishment. We denote the discounted payoff with deviation as $U_i^D$ and that without deviation as $U_i^C$. As $\delta \to 1$, $U_i^D$ converges to $R_i^S$ almost surely and $U_i^C$ converges to $R_i^C$ almost surely.

Hence, cooperation exists only if $U_i^C (= r_i^C) > U_i^D (= r_i^S)$, i.e., all players are self-enforced to cooperate because of the punishment that follows deviation. We use a “punish-and-forgive” strategy where the punishment state lasts only for $T - 1$ time slots and cooperation resumes from the T-th time slot. The parameter T can be determined by analyzing the incentive of the players: if the tendency to deviate is stronger, the punishment should also be harsher to prevent deviation.
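How the incentive analysis can fix T is sketched below under simplifying assumptions that are not taken from [13]: the per-slot payoffs are treated as constants (`r_c` for cooperation, `r_s` for the punished selfish state, `r_d` for the slot in which the player deviates), and only a single deviation within one cycle of T slots is examined. The function names are hypothetical.

```python
def discounted_sum(payoffs, delta):
    """Sum of delta**n * payoffs[n] over one cycle of slots."""
    return sum((delta ** n) * r for n, r in enumerate(payoffs))


def min_punishment_period(r_c, r_s, r_d, delta, t_max=200):
    """Smallest T for which one deviation followed by T-1 punished slots
    pays no more than T cooperative slots; None if no T <= t_max works."""
    for t in range(1, t_max + 1):
        cooperate = discounted_sum([r_c] * t, delta)
        deviate = discounted_sum([r_d] + [r_s] * (t - 1), delta)
        if cooperate >= deviate:
            return t
    return None


if __name__ == "__main__":
    # Assumed constant payoffs, for illustration only.
    print(min_punishment_period(r_c=1.0, r_s=0.4, r_d=2.0, delta=0.9))
    print(min_punishment_period(r_c=1.0, r_s=0.4, r_d=3.0, delta=0.9))
```

In this sketch a larger one-slot gain from deviating (`r_d`) requires a longer punishment, in line with the remark that a stronger tendency to deviate calls for a harsher punishment. When the gain exceeds what any punishment of bounded length can take back, the function returns `None`.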

B. Cooperation with Optimal Detection

We assume that there is a common control channel over which players can exchange information. Based on the information exchanged, the players decide who should transmit in a given slot. Only one player transmits at a time. Each slot is divided into three phases: in the first phase, each player exchanges channel information with the others; in the second phase, the players decide whether to access the spectrum or not, according to the cooperation rule; in the third phase, the eligible player transmits data. During the third phase, the eligible player pauses transmission and ‘listens’ to the channel for some time to catch deviators. If the player finds any other player deviating, the system is switched into punishment mode.

We consider two cooperation rules: the maximum total throughput (MTT) criterion, which maximizes the sum of individual payoffs, and the approximate proportional fairness (APF) criterion, which maximizes their product. The punishment-based spectrum sharing game provides an incentive for players to be honest, as deviation is deterred by the threat of punishment. Detection of deviating behavior is necessary for the threat to be credible.
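Written out, with $R_i$ standing for the payoff of player i under a candidate access rule (this notation is an assumption introduced here, not the source's), the two criteria read

$$\text{MTT:}\quad \max \sum_{i=1}^{K} R_i, \qquad\qquad \text{APF:}\quad \max \prod_{i=1}^{K} R_i \;\Longleftrightarrow\; \max \sum_{i=1}^{K} \log R_i .$$

The equivalence on the right holds for strictly positive payoffs because the logarithm is increasing. It also shows why the product criterion is associated with fairness: a rule that drives any single $R_i$ towards zero is heavily penalized, whereas the sum criterion is indifferent to how the total is split.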

C. Cheat-Proof Strategies

The repeated game discussed above inherently assumes that complete and perfect information is available. But information such as the power constraints and channel gains is private to each player. So, selfish players may provide false information to get a higher payoff. Therefore, enforcing truth-telling is a crucial problem.

1) Mechanism-design-based strategy: Mechanism design provides incentives for players to be honest. Players claiming high values are asked to pay a tax, and the amount of the tax increases with the claimed value. Some monetary compensation is given to the players reporting low values. The spectrum sharing game then becomes a new game in which the original payoffs are replaced by overall payoffs that include the monetary transfers. The transfer function can be designed such that the players get the highest payoff only when they claim their true private values. With this transfer function, all players' payments and incomes add up to 0 at any time slot. This means that money is exchanged only within the community of cooperative players at any time, a property that suits the open spectrum sharing scenario.
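The zero-sum property of the transfers can be illustrated with a small sketch. It is not the transfer function of [13]: the tax rule `linear_tax` and the equal redistribution among the other players are assumptions chosen only to show the budget-balance structure, and this toy rule by itself does not guarantee truth-telling.

```python
def transfers(claims, tax):
    """Net monetary transfer to each player (negative = pays, positive = receives).

    Each player pays tax(claim); the tax collected from a player is split
    equally among the other players, so the transfers always sum to zero.
    """
    k = len(claims)
    taxes = [tax(v) for v in claims]
    total = sum(taxes)
    return [-taxes[i] + (total - taxes[i]) / (k - 1) for i in range(k)]


def linear_tax(value):
    """Hypothetical tax rule, increasing in the claimed value."""
    return 0.5 * value


if __name__ == "__main__":
    # Made-up claimed values for three players.
    print(transfers([0.2, 1.0, 3.0], linear_tax))
```

In this example the player with the highest claim is a net payer and the player with the lowest claim is a net receiver, while the transfers cancel out within the group.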

2) Statistics-based strategy: For the APF rule, every player reports the normalized channel gain and the player with the highest reported value gets access to the spectrum. The normalized gains are exponentially distributed with mean 1. So, in the long run, each player will have access to the spectrum for 1/K of the total slots. If player i occupies the spectrum for more than (1/K + ε) of the total slots, where ε is a pre-determined threshold, it is highly probable that the player has cheated. Such a player is marked as a cheater and punished. In this way, the profit of cheating is greatly limited.

Fig. 6. The payoffs under a heterogeneous setting with different cooperation rules.
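The statistical check can be tried out in a short simulation. This is a sketch under assumed settings, not the experiment of [13]: the number of players, the number of slots, the threshold `epsilon`, and the way a cheater inflates its report (multiplying the true gain by a fixed factor) are all illustrative choices, and the function names are hypothetical.

```python
import random


def simulate_shares(num_players, num_slots, inflate=None, seed=0):
    """Fraction of slots won by each player under highest-reported-gain access.

    inflate maps a player index to the factor by which that player
    multiplies its true normalized gain before reporting (cheating).
    """
    rng = random.Random(seed)
    inflate = inflate or {}
    wins = [0] * num_players
    for _ in range(num_slots):
        # True normalized gains: exponential with mean 1, as stated in the text.
        true_gains = [rng.expovariate(1.0) for _ in range(num_players)]
        reported = [g * inflate.get(i, 1.0) for i, g in enumerate(true_gains)]
        winner = max(range(num_players), key=reported.__getitem__)
        wins[winner] += 1
    return [w / num_slots for w in wins]


def flag_cheaters(shares, epsilon):
    """Indices of players whose share of slots exceeds 1/K + epsilon."""
    threshold = 1.0 / len(shares) + epsilon
    return [i for i, s in enumerate(shares) if s > threshold]


if __name__ == "__main__":
    honest = simulate_shares(4, 20000)
    cheating = simulate_shares(4, 20000, inflate={0: 2.0})
    print(flag_cheaters(honest, 0.05), flag_cheaters(cheating, 0.05))
```

With honest reports every share stays close to 1/K, so nobody is flagged. A player that consistently doubles its reported gain wins far more than 1/K + ε of the slots and is singled out.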

D. Simulation Results

We consider a scenario with two players (K = 2) with the same maximum power constraint and the same relative interference γ. The players can gain more by cooperating than by being selfish. But cooperation is unnecessary when the interference is very low (γ ≈ 0). It was observed that the payoff of cooperation is higher than that of non-cooperation for γ > 0.15.

Now we consider a heterogeneous environment where the players have different power constraints. We fix the power constraint of player 1 and increase the power constraint of player 2, $P_2^M$. The payoffs with the MTT and APF rules are shown in Fig. 6, where ‘1’ and ‘2’ refer to the payoffs of player 1 and player 2, respectively. The payoffs without cooperation and the payoffs using the max-min fairness criterion (denoted by “NOC” and “MMF”, respectively) are also shown for comparison. It can be seen from the figure that both the MTT and APF rules outperform the non-cooperation case. This means players have the incentive to cooperate under both rules. It was also seen that the payoff is maximized only if the player honestly claims his/her true information. Therefore, players are self-enforced to tell the truth with this mechanism.

VII. CONCLUSION

Although game theory has been used extensively to model the interactions between users in a cognitive radio network, it faces certain challenges too. Choosing a proper payoff function does not always result in a simple analysis for the game theoretic model. As cognitive radio networks benefit from technology evolution, the same technologies can also be used by malicious users to launch more complicated and unpredictable attacks. It is therefore wise to use the framework of game theory judiciously.

Dynamic spectrum sharing is one of the key functions of cognitive radio networks. In this paper, we first discussed the basics of game theory. Then we presented and elaborated on the various game theoretic models, namely Cournot, auction-based, Bertrand and cheat-proof, which can be applied to the spectrum sharing scenario. Each model has been dealt with separately, giving extensive knowledge of how the problem is formulated, which governing conditions are imposed and, ultimately, the equilibrium attained. The Cournot model is the most primitive form of modelling a spectrum sharing problem, and it concludes that the NE is the most desirable solution. The auction-based model is a novel way of modelling the spectrum sharing problem against the background of auction theory. The Bertrand game moves a step ahead and shows the inefficiency of the NE and how to improve upon it. Cheat-proof strategies give insight into mechanism design, which induces the players to be honest. We have exhaustively analysed and presented existing game theoretic models in the spectrum sharing scenario.

REFERENCES

[1] S. Haykin. Cognitive radio: brain-empowered wireless communications. Selected Areas in Communications, IEEE Journal on, 23(2):201–220, Feb. 2005.

[2] Ian F. Akyildiz, Won-Yeol Lee, Mehmet C. Vuran, and Shantidev Mohanty. Next generation/dynamic spectrum access/cognitive radio wireless networks: A survey. Computer Networks, 50(13):2127–2159, 2006.

[3] Magnus M. Halldorsson, Joseph Y. Halpern, Li (Erran) Li, and Vahab S. Mirrokni. On spectrum sharing games. In Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, PODC ’04, pages 107–114, New York, NY, USA, 2004. ACM.

[4] Jane Wei Huang and Vikram Krishnamurthy. Game theoretic issues in cognitive radio systems (invited paper). Journal of Communications, 4(10), November 2009.

[5] Beibei Wang, Yongle Wu, and K.J. Ray Liu. Game theory for cognitive radio networks: An overview. Computer Networks, 54(14):2537–2561, 2010.

[6] Martin J. Osborne and Ariel Rubinstein. A course in game theory. MIT Press, 1994.

[7] Allen B. MacKenzie. Game Theory for Wireless Engineers. Morgan & Claypool Publishers, 2006.

[8] Martin J. Osborne. An introduction to game theory. Oxford University Press, 2003.

[9] Li Yan-bin, Wang Li-feng, and Li Ying. An improved game-theoretic spectrum sharing algorithm in cognitive radio networks. In Computer Research and Development (ICCRD), 2011 3rd International Conference on, volume 2, pages 499–503, March 2011.

[10] D. Niyato and E. Hossain. A game-theoretic approach to competitive spectrum sharing in cognitive radio networks. In Wireless Communications and Networking Conference, 2007. WCNC 2007. IEEE, pages 16–20, March 2007.

[11] Xinbing Wang, Zheng Li, Pengchao Xu, Youyun Xu, Xinbo Gao, and Hsiao-Hwa Chen. Spectrum sharing in cognitive radio networks – an auction-based approach. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 40(3):587–596, June 2010.

[12] D. Niyato and E. Hossain. Competitive pricing for spectrum sharing in cognitive radio networks: Dynamic game, inefficiency of Nash equilibrium, and collusion. Selected Areas in Communications, IEEE Journal on, 26(1):192–202, Jan. 2008.

[13] Yongle Wu, B. Wang, K.J.R. Liu, and T.C. Clancy. Repeated open spectrum sharing game with cheat-proof strategies. Wireless Communications, IEEE Transactions on, 8(4):1922–1933, April 2009.