0018-9545 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2017.2748570, IEEE Transactions on Vehicular Technology

LOAD BALANCING SCHEME WITH SMALL-CELL COOPERATION FOR CLUSTERED HETEROGENEOUS CELLULAR NETWORKS, J.-B. PARK AND K. S. KIM 1

Load Balancing Scheme with Small-Cell Cooperation for Clustered Heterogeneous Cellular Networks

Jin-Bae Park, Student Member, IEEE, and Kwang Soon Kim†, Senior Member, IEEE

Abstract—In this paper, a joint user association (UA) scheme with JP-CoMP using a hybrid self-organizing network (SON) is proposed for a practical clustered heterogeneous cellular network (cHCN) to maximize the network-wide proportional fairness among users. The cell range expansion (CRE) and the enhanced inter-cell interference coordination (e-ICIC) have been considered as key items in the long term evolution-advanced (LTE-A) to offload macro-cell users to small-cell base stations (sBSs). However, in a cHCN where sBSs are not distributed at random but are clustered instead, the coverage of inner sBSs in a small-cell cluster can hardly be expanded, and an increased bias may result in much poorer link quality as well as much higher load in outer sBSs. Thus, the load balancing capability becomes much lower than expected in a cHCN. To cope with this problem, a network architecture and protocol for the cHCN are suggested and a feasible suboptimal iterative algorithm for determining the joint UA solution of the proposed hybrid SON is provided. It is shown that the proposed hybrid SON scheme with the proposed joint UA solution is very effective in handling the load balancing in a practical cHCN: it not only improves the performance of the inner sBS users by reducing the inter-cell interference, especially for intra-tier offloaded users, but also enables more aggressive inter-tier offloading by effectively improving the link quality of cluster-edge users without causing unnecessary resource waste.

Index Terms—Clustered HCN, CoMP, Load Balancing.

I. INTRODUCTION

In current cellular networks, conventional macro-cells are being overlaid with low-powered small-cells as a way to enhance the spectral efficiency per area [1]. Especially in realistic scenarios such as those based on real measurement data [2][3] and those in the technical report by 3GPP [4], more than one small-cell BS (sBS) might be installed in hotspot areas [5][6], which can be modeled as a clustered HCN (cHCN) where sBSs and users are clustered in hotspot areas [7][8]. In an HCN, or in a cHCN, which can be considered as an HCN with dense small-cells in some locations, one of the most important challenges for achieving the full potential of a small-cell deployment is how to balance the loads among cells [9]-[11].

J.-B. Park and K. S. Kim are with the Department of Electrical and Electronic Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Korea.
†: Corresponding author ([email protected])
This research was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2014-0-00552, Next Generation WLAN System with High Efficient Performance and 2016-0-00208, High Accurate Positioning Enabled MIMO Transmission and Network Technologies for Next 5G-V2X (vehicle-to-everything) Services) and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2014R1A2A2A01007254).

There have been many efforts to develop effective load balancing schemes in an HCN, and they can be categorized into distributed load balancing schemes [5][12]-[18] and joint load balancing schemes [19]-[26]. Among distributed load balancing schemes, a cell range expansion (CRE) scheme using a fixed bias [12]-[15] combined with an enhanced inter-cell interference coordination (e-ICIC) scheme using almost blank subframes (ABS) [5][16]-[18] has been adopted in the 3GPP long term evolution-advanced (LTE-A) as a user association (UA) scheme with resource partitioning (RP) to efficiently offload more traffic from macro-cells to small-cells. When sBSs are uniformly distributed, the above distributed UA with an RP scheme allows small-cells to be expanded while maintaining the overall link quality of the small-cell users compared to the maximum signal-to-noise power ratio (SNR) approach, where each user is associated with the cell with the strongest SNR [27][28]. However, the performance might not be optimal for a given user and BS geometry due to the distributed nature and the use of common bias and ABS ratio values. On the other hand, joint load balancing schemes have been proposed to further improve a network-wide utility such as the network-wide proportional fairness [19]-[24], the network capacity [25], or the minimum throughput [26]. Since a joint UA with RP scheme as in [22]-[24] can jointly determine the UA and the ABS ratio of each macro-cell for a given user and BS geometry, lightly-loaded sBSs can be expanded more aggressively so that more macro-cell users as well as small-cell users in heavily-loaded sBSs can be offloaded without deteriorating the overall link quality, which results in better network utility [22].
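The difference between the maximum-SNR rule and CRE with a common bias can be illustrated with a minimal sketch. The cell names (`m1`, `s_out`, `s_in`), the SNR values, and the 8 dB bias are hypothetical and are not taken from the paper; the sketch only assumes that the bias is added, in dB, to the average SNR of every small cell before the strongest cell is chosen.

```python
def associate(snr_db, tier, bias_db=0.0):
    """Pick a serving cell per user from average SNRs in dB.

    snr_db: {user: {cell: SNR in dB}}; tier: {cell: "macro" or "small"}.
    A common bias (dB) is added to every small cell before the comparison;
    bias_db = 0 reduces to the maximum-SNR rule.
    """
    choice = {}
    for user, cells in snr_db.items():
        def biased(cell):
            return cells[cell] + (bias_db if tier[cell] == "small" else 0.0)
        # sorted() only makes tie-breaking deterministic
        choice[user] = max(sorted(cells), key=biased)
    return choice


def load(choice):
    """Number of users associated with each cell."""
    counts = {}
    for cell in choice.values():
        counts[cell] = counts.get(cell, 0) + 1
    return counts


# Hypothetical toy geometry: one macro cell, one outer and one inner sBS.
tier = {"m1": "macro", "s_out": "small", "s_in": "small"}
snr = {
    "u1": {"m1": 12.0, "s_out": 6.0, "s_in": 1.0},
    "u2": {"m1": 10.0, "s_out": 7.0, "s_in": 2.0},
    "u3": {"m1": 3.0, "s_out": 9.0, "s_in": 8.0},
    "u4": {"m1": 2.0, "s_out": 5.0, "s_in": 11.0},
}

max_snr_load = load(associate(snr, tier))           # no bias
cre_load = load(associate(snr, tier, bias_db=8.0))  # common 8 dB bias
```

In this toy setting the bias moves both macro-cell users to the outer sBS, while the inner sBS gains no users, which is the kind of imbalance the following paragraphs discuss for clustered deployments.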

However, in a cHCN where sBSs are not distributed at random but are clustered instead, the coverage of inner sBSs in a small-cell cluster can hardly be expanded by the distributed approach, and an increased bias for offloading as many macro-cell users to the small-cell layer as in an HCN results in much poorer link quality as well as much higher load in outer sBSs. Thus, the load balancing capability of a distributed scheme becomes much lower than expected in a cHCN. Although the former can hardly be mitigated, the latter can be alleviated to some extent by using a joint approach, in which some portion of the load of outer sBSs can be offloaded to lightly-loaded inner sBSs. However, the impact is limited due to the poor link quality of such offloaded users.

Copyright (c) 2017 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].

In order to improve the poor link quality of offloaded users, using a joint processing coordinated multi-cell processing (JP-CoMP) such as multi-cell zero-forcing beamforming (ZFBF) [29] or maximum ratio transmission (MRT) [30] within a small-cell cluster is very promising. This is because cell cooperation using JP-CoMP may improve the link quality by mitigating inter-cell interference among small-cells or increase the effective signal power from a small-cell cluster by applying the ZFBF and MRT schemes selectively for each user. Here, the adaptive CoMP selection can be realized by a joint load balancing scheme such as in [19][22] in conjunction with an RP scheme such as the e-ICIC. However, although macro-cell BSs (mBSs) can be configured uniformly, sBSs in a cHCN can be configured quite differently in different clusters in terms of the number of antennas, the transmission scheme, the resource partitioning and scheduling strategy, the backhaul quality (whether or not it allows cell cooperation), etc. Thus, collecting such information on each sBS at a centralized network controller is not suitable for a practical system such as the LTE-A, and introducing JP-CoMP into a joint load balancing scheme in a cHCN is not straightforward and is actually very challenging. Note that the LTE-A cellular network may include a small-cell layer cluster mobility management entity (C-MME) and cluster gateway (C-GW) to control sBSs in a small-cell cluster so that JP-CoMP can be favored, but the core network considers each small-cell cluster as an mBS [31]. Therefore, such a core network architecture should be taken into account and a novel joint association scheme suitable for a practical cHCN is required.
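The trade-off between the two JP-CoMP modes can be seen in a small numerical sketch. It assumes two single-antenna sBSs jointly serving two single-antenna users over a hypothetical real-valued channel matrix `H` (rows are users, columns are sBSs), with unit-norm beams; none of these values come from the paper. ZFBF removes the interference at the other user at the cost of signal power, while MRT maximizes the signal power of the intended user but leaks interference.

```python
import math


def zf_beams(H):
    """Unit-norm zero-forcing beams for a 2x2 channel H (rows = users).

    Beam k is column k of H^{-1}, scaled to unit norm, so that user j != k
    receives nothing from it.
    """
    (a, b), (c, d) = H
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    beams = []
    for k in range(2):
        col = [inv[0][k], inv[1][k]]
        norm = math.sqrt(sum(abs(x) ** 2 for x in col))
        beams.append([x / norm for x in col])
    return beams  # beams[k][n]: weight applied at sBS n for user k


def mrt_beam(h):
    """Unit-norm maximum ratio transmission beam matched to channel row h."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x.conjugate() / norm for x in h]


def gain(h, w):
    """Received power |h . w|^2 of beam w at a user with channel row h."""
    return abs(sum(hn * wn for hn, wn in zip(h, w))) ** 2


# Hypothetical channel: each user is closer to "its own" sBS.
H = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, 1 + 0j]]
W = zf_beams(H)
m0 = mrt_beam(H[0])

zf_signal = gain(H[0], W[0])  # power at user 0 from its ZF beam
zf_leak = gain(H[1], W[0])    # interference at user 1 from user 0's ZF beam
mrt_signal = gain(H[0], m0)   # power at user 0 from its MRT beam
mrt_leak = gain(H[1], m0)     # interference at user 1 from user 0's MRT beam
```

Selecting between the two modes per user, as the text describes, amounts to choosing between interference suppression and signal power for that user's position in the cluster.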

In this paper, a joint UA scheme with JP-CoMP using a hybrid self-organizing network (SON) is proposed for a practical cHCN to maximize the network-wide proportional fairness among users. In this scheme, a central SON algorithm manages a macroscopic user association and a distributed local SON algorithm in each cluster manages a joint UA with an RP scheme by considering adaptive CoMP mode selection for given user locations. A network architecture and protocol for the cHCN are suggested and a feasible suboptimal iterative algorithm for determining the joint UA solution of the proposed hybrid SON is provided. It is shown that the proposed hybrid SON scheme with the proposed joint UA solution is very effective in handling the load balancing in a practical cHCN. This comes from the fact that the proposed scheme not only improves the performance of the inner sBS users by reducing the inter-cell interference, especially for intra-tier offloaded users, but also enables more aggressive inter-tier offloading by effectively improving the link quality of cluster-edge users without causing unnecessary resource waste. The novelty and main contributions of this paper can be summarized as follows.

i) When considering an efficient load balancing scheme for a cHCN, JP-CoMP needs to be considered among small cells, especially with a non-static BS grouping to alleviate the problems of group-edge users as previously described. However, such a joint UA problem with a dynamic BS grouping is intractable. In this paper, this difficulty is resolved by formulating a joint UA problem with a semi-dynamic BS grouping and resource partitioning and proposing a suboptimal solution with polynomial-time complexity, and

ii) When considering the implementation feasibility, existing schemes are not appropriate because a centralized scheme such as in [22] suffers from impractical signalling overhead and computational load at the central entity. Although it may be implemented in a distributed way as in [19], it involves iterations between BSs and users using wireless resources in a synchronized manner, which is not suitable for a practical system, either. In this paper, a hybrid approach is proposed where iterations are performed among core network entities only, which is more suitable for a practical implementation. Also, the implementation feasibility of the proposed scheme is checked for a typical scenario.

Fig. 1. The cHCN model. (Example with three clusters $\xi(c_1)$, $\xi(c_2)$, $\xi(c_3)$, where $C = \{c_1, c_2, c_3\}$, $U = \{u_1, u_2, \ldots, u_{14}\}$, $B^m = \{b^m_1, b^m_2, b^m_3\}$, and $B^s_c = \{b^s_{c,1}, b^s_{c,2}, b^s_{c,3}\}$ for $c \in C$.)

The remainder of this paper is organized as follows. In Section II, the system model is presented. In Section III, the proposed hybrid architecture and protocol are illustrated. In Section IV, the proposed joint load balancing scheme is presented. In Section V, simulation results and the corresponding discussions are provided, and concluding remarks are given in Section VI.

II. THE CHCN MODEL

Consider a downlink two-tier cHCN in which macro cells are overlaid with (clustered) small cells using a lower transmit power as shown in Fig. 1. According to realistic scenarios where there could be some hotspots with different user densities [32] and an operator deploys both macro-cell BSs (mBSs) and sBSs to provide coverage flexibly and promptly [4], more than one sBS can be installed in hotspot areas [5][6]. Each cluster, denoted as $\xi(c)$, is modeled as a closed region centered at $c$, and the set of cluster center locations is denoted as $C = \{c_1, c_2, \ldots\}$, which is assumed to follow a point process $\Phi^c$ with density $\lambda_c$. Also, the set of user locations is denoted as $U = \{u_1, u_2, \ldots\}$ and is assumed to follow a mixture point process $\Phi^u \cup \bigcup_{c \in C} \Phi^u_c$, where $\Phi^u$ is a point process with density $\lambda_u$ and $\Phi^u_c$ is a point process over $\xi(c)$ with density $\lambda_h$. Note that we focus on an operator-installed small-cell deployment scenario with an open access strategy and trusted backhauls. As shown in Fig. 1, it is assumed that an operator deploys mBSs network-wide and sBSs in each cluster. Denote the set of mBS locations and the set of sBS locations in cluster $c$ as $B^m = \{b^m_1, b^m_2, \ldots\}$ and $B^s_c = \{b^s_{c,1}, b^s_{c,2}, \ldots\}$, respectively. Here, $B^m$ is assumed to follow a point process $\Phi^m$ with density $\lambda_m$ and $B^s_c$ is assumed to follow a point process $\Phi^s_c$ over $\xi(c)$ with density $\lambda_s(c)$.

Fig. 2. Network architecture for a two-tier cHCN: (a) hybrid SON for a two-tier cHCN; (b) sBS clustering phase for a two-tier cHCN.
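A deployment of the kind described in this section can be sampled with a short sketch. The text only speaks of point processes with given densities; the sketch below additionally assumes homogeneous Poisson point processes, a square simulation area, and disc-shaped clusters of a common radius, and all function names and numeric values are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's method for a Poisson-distributed count (fine for small means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def uniform_in_square(n, side, rng):
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def uniform_in_disc(n, center, radius, rng):
    pts = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())  # sqrt keeps the area density uniform
        a = rng.uniform(0, 2 * math.pi)
        pts.append((center[0] + r * math.cos(a), center[1] + r * math.sin(a)))
    return pts


def sample_chcn(side, radius, lam_c, lam_m, lam_u, lam_h, lam_s, seed=0):
    """Sample cluster centers, mBSs, per-cluster sBSs, and users."""
    rng = random.Random(seed)
    area, disc = side * side, math.pi * radius ** 2
    centers = uniform_in_square(poisson(lam_c * area, rng), side, rng)
    mbs = uniform_in_square(poisson(lam_m * area, rng), side, rng)
    users = uniform_in_square(poisson(lam_u * area, rng), side, rng)  # background users
    sbs = {}
    for c in centers:
        sbs[c] = uniform_in_disc(poisson(lam_s * disc, rng), c, radius, rng)
        users += uniform_in_disc(poisson(lam_h * disc, rng), c, radius, rng)  # hotspot users
    return centers, mbs, sbs, users


centers, mbs, sbs, users = sample_chcn(
    side=10.0, radius=1.0, lam_c=0.03, lam_m=0.02, lam_u=0.2, lam_h=2.0, lam_s=1.0
)
```

The per-cluster sBS density is taken as a single value here for brevity; a cluster-dependent density as in the text would replace `lam_s` by a function of the cluster center.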

The core network architecture is assumed as shown in Fig. 2-(a). Here, sBSs are assumed to be clustered and the sBSs for each cluster are connected via wired backhaul lines to the C-MME and the C-GW as in [33], which may be assumed to be co-located and connected via X2 with the nearest mBS. Also, each mBS, C-MME, and C-GW are connected to the evolved packet core (EPC) composed of the MME, the serving gateway (S-GW), and the packet data network gateway (P-GW). Note that a home eNode B (HeNB) gateway is defined in the 3GPP specification [31] and it can play the role of a C-MME with an S1 interface between C-MMEs and an EPC-MME. Then, HeNBs connected to the same HeNB gateway naturally form a small-cell cluster. Thus, such a cHCN model and the corresponding network architecture can model a practical LTE-A heterogeneous network. For the abstracted air-interface, it is assumed that one frame is comprised of $N_S$ subframes and each subframe consists of $N_{RB}$ resource blocks (RBs), where the user association and BS grouping can be updated for each frame if necessary due to user mobility.

Considering the case where previously installed small-cells or user-installed open access small-cells exist together in such hotspots, the sBS clustering phase shown in Fig. 2-(b) is performed by the EPC-MME. Here, based on the reported user measurement information $\{\gamma_u\}$, the EPC-MME can transform $\gamma_u$ into an approximate relative distance between the user and the corresponding sBS and find the relative location of each sBS by utilizing a least-squares source location estimation, such as the total least squares algorithm [34]. Then, the sBSs are clustered by using a widely-used clustering algorithm, such as the k-means algorithm [35] with the elbow method [36].
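The clustering step can be sketched as follows, assuming the relative sBS locations have already been estimated. The sketch uses plain Lloyd iterations with a deterministic farthest-point initialisation and picks the number of clusters at the largest bend (second difference) of the inertia curve; this particular elbow rule, the function names, and the sample coordinates are assumptions made for illustration, not details given in the paper.

```python
def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    def assign(centers):
        return [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(centers)
        new = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new.append((sum(p[0] for p in members) / len(members),
                            sum(p[1] for p in members) / len(members)))
            else:
                new.append(centers[j])  # keep the old center of an empty cluster
        if new == centers:
            break
        centers = new
    labels = assign(centers)
    inertia = sum(d2(p, centers[l]) for p, l in zip(points, labels))
    return labels, inertia


def elbow_k(points, k_max):
    """Number of clusters at the largest bend of the inertia curve (k_max >= 3)."""
    inertia = [kmeans(points, k)[1] for k in range(1, k_max + 1)]
    bends = {k: inertia[k - 2] - 2 * inertia[k - 1] + inertia[k]
             for k in range(2, k_max)}
    return max(bends, key=bends.get)


# Hypothetical estimated sBS locations forming three hotspots.
sbs_xy = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
n_clusters = elbow_k(sbs_xy, 5)
labels, inertia = kmeans(sbs_xy, n_clusters)
```

The resulting labels play the role of the cluster membership that determines which sBSs share a C-MME in the architecture of Fig. 2.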

The SON model for a joint user association is also shown in Fig. 2. Here, each user $u$ reports its measurement $\gamma_u$ to the EPC-MME via its currently associated BS. In an LTE system, the user measurement report $\gamma_u$ is delivered as the neighbor cell measurement report if a predefined event trigger criterion is met, where such a criterion can be set according to the service provider's strategy, the cell locations, etc. Also, the C-MME in cluster $c$ informs the EPC-MME of its cluster uplink information $\Gamma_c$, and the EPC-MME informs the C-MME in cluster $c$ of its cluster downlink information $\Psi_c$. Based on $\{\gamma_u | u \in U\}$ and $\{\Gamma_c | c \in C\}$, the EPC-MME performs its SON function and delivers the updated user association information vector $I_b$ to each BS $b$ and the updated association information $J_u$ to each user $u$ via its currently associated BS. Note that the above SON model can include a fully centralized SON scheme such as in [22], a centralized SON scheme implemented by using a distributed computation such as in [19], and a distributed SON scheme such as the CRE and e-ICIC scheme with optimal bias and ABS ratio values selected network-wide.

In a centralized SON scheme such as in [22], $\gamma_u$ is a set of measured average SNR values from its neighboring BSs and $\Gamma_c$ is the full information required for evaluating the expected throughput of each user if associated to each cell in cluster $c$ according to the number of antennas, the transmission scheme, the resource partitioning and scheduling strategy, the backhaul quality, etc. Based on them, the EPC-MME jointly determines $I_b$ and $J_u$ for all $b \in B^m \cup \bigcup_{c \in C} B^s_c$ and $u \in U$ and delivers them to each BS and C-MME via $\Psi_c$. Finally, each BS delivers the association information to its associated users. The centralized SON scheme can achieve the maximum performance of the joint user association, but the cluster uplink information $\{\Gamma_c | c \in C\}$ and the computational load at the EPC-MME become so high that it is inappropriate to be applied in practice. Such a centralized SON may instead be implemented by using a distributed computation, such as in [19], in which $\gamma_u$ is the association request from user $u$ and $J_u$ is the set of BS-specific information, such as the price information or the allocated ABS ratio of the neighboring BSs, required for user $u$ to determine its association. Also, $\Gamma_c$ and $\Psi_c$ denote the information from cluster $c$ to the EPC-MME according to the user requests in cluster $c$ and the information for cluster $c$ to update its BS-specific information by considering others, respectively. Then, an iterative SON procedure is performed among all the entities so that each user can determine its associated BS in a distributed manner. However, although its computational complexity and backhaul overhead do not cause any problem, all the involved entities should be synchronized and wireless resources are wasted during the distributed iteration among the BSs and users, which is also not suitable in practice. On the other hand, a distributed SON can be easily implemented in practice by performing UA locally and exchanging some measured statistics and the network-wide bias and ABS ratio values as the cluster uplink and downlink information, respectively. However, the load balancing capability is expected to be low, especially in a cHCN.

TABLE I
SUMMARY OF THE NOTATIONS.

Symbol | Description
$C$, $U$, $B^m$, $B^s_c$ | Sets of clusters, users, mBSs, and sBSs in cluster $c$
$\gamma_u$, $\Gamma_c$, $\Psi_c$, $I_b$, $J_u$ | Measurement for user $u$, cluster uplink information for cluster $c$, cluster downlink information for cluster $c$, UA information for BS $b$, and the association information for user $u$
$B_u$ | Set of neighboring BSs for user $u$
$C_d$ | Set of clusters partitioned to mBS $d$ determined at the EPC-MME
$B_c$, $A_c$ | BS group information and UA information determined at the C-MME in cluster $c$
$B^t_c = \{B^t_{c,j}\}$, $A^t_c = \{A^t_{c,j}\}$ | BS group collection and UA vector collection for resource $t$ in cluster $c$
$B^t_{c,j}$, $A^t_{c,j} = [A^{t,\chi}_{c,j}]_{\chi \in X}$ | The $j$th BS group and the $j$th UA set vector for resource $t$ in cluster $c$
$A^{t,\chi}_{c,j}$ | The $j$th UA set for resource $t$ using transmission scheme $\chi$ in cluster $c$
$I_b = [I_{b,t}]_{t \in [T]}$, $J_u = [J_{u,t}]_{t \in [T]}$ | UA vector for BS $b$, associated BS group vector for user $u$
$I_{b,t} = [I^\chi_{b,t}]_{\chi \in X}$, $J_{u,t} = [J^\chi_{u,t}]_{\chi \in X}$ | UA vector for BS $b$, associated BS group vector for user $u$ for resource $t$
$I^\chi_{b,t}$, $J^\chi_{u,t}$ | Users for BS $b$ and associated BS group for user $u$ using transmission scheme $\chi$ for resource $t$
$U_d$, $X_d$, $\varsigma_d$ | Sets of associated users and association variables for an mBS or a cluster $d$ and ABS ratio for mBS $d \in B^m$ determined at the EPC-MME
$\Gamma_c = \{Q_c, \nabla_c, \Delta_c\}$ | The cluster uplink information for the C-MME in cluster $c$

III. HYBRID SON ARCHITECTURE AND PROTOCOL

In this paper, a 2-level hybrid SON scheme is considered. In this scheme, the EPC-MME performs the macroscopic UA based on the user-reported information $\{\gamma_u | u \in U\}$ and the cluster uplink information $\{\Gamma_c | c \in C\}$ from the C-MME in each cluster and delivers the cluster downlink information $\{\Psi_c | c \in C\}$. On the other hand, the C-MME in each cluster determines the joint UA for its local cluster based on its local information and updates the cluster uplink information. These two steps are performed in an iterative manner.

For the proposed hybrid SON, each user $u$ measures the average SNR values $\gamma_{u,b}$ of its neighboring BSs $b \in B_u \subset B^m \cup \bigcup_{c \in C} B^s_c$ to form $\gamma_u = \{\gamma_{u,b} | b \in B_u\}$ and reports it to the EPC-MME via its currently associated BS. Based on $\{\gamma_u | u \in U\}$, the EPC-MME decides the macroscopic user association $\{U_d, X_d, \varsigma_d | d \in B^m \cup C\}$, where $U_d \subset U$, $X_d = \{x_{u,d} | u \in U_d\}$, and $x_{u,d}, \varsigma_d \in [0, 1]$ denote the user association variable of user $u$ and the ABS ratio of an mBS or cluster $d$, respectively. The EPC-MME then delivers $U_d$, $X_d$, and $\varsigma_d$ to each mBS ($d \in B^m$) or to each C-MME ($d \in C$) by setting $\Psi_d = [U_d, \varsigma_d]$ for $d \in B^m$ and $\Psi_d = [U_d, X_d, \varphi_d]$ for $d \in C$, where $\varphi_d$ denotes the ABS ratio information of an mBS around $d \in C$, which will be defined later. Here, the EPC-MME partitions the clusters according to each mBS as $\{C_d | d \in B^m\} \in \rho(C)$, where $\rho(A)$ denotes the collection of all partitions of a set $A$, so that a common ABS ratio is used among each mBS and the sBSs in its associated clusters, i.e., $\varsigma_c = \varsigma_d$ for $c \in C_d$.

Fig. 3. A toy example of the joint UA scheme with JP-CoMP. (Two panels, $t = 1$ and $t = 2$, for a cluster with $U_c = \{u_1, u_2, \ldots, u_9\}$ and $B^s_c = \{b^s_{c,1}, b^s_{c,2}, b^s_{c,3}, b^s_{c,4}\}$; the BS groups use ZFBF or MRT.)
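The cluster partition $\{C_d\}$ and the shared ABS ratio $\varsigma_c = \varsigma_d$ can be sketched as below. The text does not state how the EPC-MME chooses the partition; the sketch assumes a nearest-mBS rule, in line with the earlier assumption that each C-MME is connected via X2 with the nearest mBS. The identifiers, coordinates, and ABS ratios are hypothetical.

```python
import math


def partition_clusters(cluster_xy, mbs_xy):
    """Map each mBS id to the cluster ids nearest to it (assumed rule).

    Every cluster is assigned to exactly one mBS; an mBS without any
    cluster simply gets an empty list.
    """
    parts = {d: [] for d in mbs_xy}
    for c, pos in sorted(cluster_xy.items()):
        nearest = min(sorted(mbs_xy), key=lambda d: math.dist(pos, mbs_xy[d]))
        parts[nearest].append(c)
    return parts


def cluster_abs_ratio(parts, abs_ratio):
    """Give every cluster the ABS ratio of the mBS it is partitioned to."""
    return {c: abs_ratio[d] for d, clusters in parts.items() for c in clusters}


# Hypothetical layout: two mBSs and three cluster centers.
mbs_xy = {"m1": (0.0, 0.0), "m2": (10.0, 0.0)}
cluster_xy = {"c1": (1.0, 1.0), "c2": (9.0, 2.0), "c3": (4.0, 0.0)}
abs_ratio = {"m1": 0.3, "m2": 0.5}

parts = partition_clusters(cluster_xy, mbs_xy)
sigma = cluster_abs_ratio(parts, abs_ratio)
```

With a common ABS ratio per block of the partition, the sBSs of a cluster know which subframes of their associated mBS are almost blank without any per-sBS negotiation.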

In this paper, JP-CoMP with a semi-dynamic BS groupingand resource partitioning is considered to alleviate the problemof group-edge users and improve the load balancing capabilityfor a cHCN. In the cluster c of interest, the sBSs in Bs

c

are partitioned to form disjoint BS groups1 and each user inUc is allocated to one of the subframe types, which involvesallocating each user to one of the BS groups for each resourcein the selected subframe type. Also, each BS group is assumedto be able to serve its users by using transmission schemeχ ∈ X = {Z,M}, where χ = Z and χ = M denotethe ZFBF [29] and MRT [30] schemes, respectively. DenoteBtc =

{Bt

c,1,Btc,2, ...

}∈ ρ (Bs

c) as the BS group set andAtc =

{At

c,1,Atc,2, ...

}as the corresponding user association

set for resource t, respectively, where Atc,j =

[At,χ

c,j

]χ∈X

and At,χc,j ⊂ Uc. Here, Bt

c,j and At,χc,j denote the jth BS

group element of Btc and the jth user association set ofAtc using transmission scheme χ ∈ X , respectively. Then,Bc =

[B1c,B2

c, ...,BTc]

and Ac =[A1

c,A2c, ...,ATc

]describe

the joint UA with a BS grouping for cluster c determinedby its C-MME, where T denotes the number of differentresources2. Note that such a resource partitioning into Tresources is cluster-wise, i.e., all sBSs and users in a clus-ter share the same RP synchronously. On the other hand,each BS group can choose its current transmission schemeamong X independently. Let Ib = [Ib,1, Ib,2, ..., Ib,T ] andJu = [Ju,1,Ju,2, ...,Ju,T ] denote the user association vectorfor BS b and the associated BS group vector for user u,respectively, where Ib,t = [Iχb,t]χ∈X and Ju,t =

[Jχu,t

]χ∈X

denote the users associated to a BS group which BS b belongsto and served by using transmission scheme χ ∈ X at resourcet and the set of BSs that form the BS group which serves user

1In case of not using a CoMP, each BS group contains only one BS.2The typical e-ICIC can be considered as the T = 2 case (the NS for t = 1

and the ABS for t = 2).


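$u$ by using the transmission scheme $\chi \in \mathcal{X}$ at resource $t$, respectively. Then, the C-MME delivers $\mathcal{I}_b$ to each $b \in \mathcal{B}^s_c$ and $\mathcal{J}_u$ to each $u \in \mathcal{U}_c$ by setting $\mathcal{I}^{\chi}_{b,t} = \mathcal{A}^{t,\chi}_{c,\pi_b(t)}$ and $\mathcal{J}^{\chi}_{u,t} = \mathcal{B}^t_{c,\kappa^{\chi}_u(t)}$, where $\pi_b(t) = \arg_{j \in \{1,2,\ldots,|\mathcal{B}^t_c|\}} \{\mathbb{1}\{b \in \mathcal{B}^t_{c,j}\} = 1\}$ and $\kappa^{\chi}_u(t) = \arg_{j \in \{1,2,\ldots,|\mathcal{B}^t_c|\}} \{\mathbb{1}\{u \in \mathcal{A}^{t,\chi}_{c,j}\} = 1\}$. Also, the C-MME delivers its cluster uplink information to the EPC-MME by setting $\Gamma_c = \{\mathcal{Q}_c, \nabla_c, \Delta_c\}$, where $\mathcal{Q}_c$, $\nabla_c$, and $\Delta_c$ denote the parameter set for the pre-determined approximated user rate evaluation, the parameter set for the linear approximation of the residual metric error, and the corresponding trust regions, respectively.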

Each mBS $b \in \mathcal{B}^m$ determines the ABS pattern for each frame according to $\varsigma_b$ and shares it with the sBSs of the clusters in $\mathcal{C}_b$ via the corresponding C-MMEs. Each mBS $b$ selects a user among $\mathcal{U}_b$ by using a scheduling algorithm such as the proportional fair scheduling (PFS) [37] during its normal subframe (NS) while it keeps silent during its ABS. On the other hand, for each resource $t$ in each cluster $c$, each BS group $\mathcal{B} \in \mathcal{B}^t_c$ (or its chief BS) can choose the ZFBF scheme or the MRT scheme at each RB. For user scheduling, a set of users whose size equals the group size $|\mathcal{B}|$ is assumed to be selected for the ZFBF scheme and a single user is assumed to be selected for the MRT scheme, in both cases by using a scheduling algorithm such as the PFS method for the multiuser case [37].

In Table I, the notations used in this paper are summarized. Also, Fig. 3 shows a toy example of the joint UA scheme with JP-CoMP in a cluster $c$, using $T = 2$ resources. Here, 4 sBSs, $\mathcal{B}^s_c = \{b^s_{c,1}, b^s_{c,2}, b^s_{c,3}, b^s_{c,4}\}$, are partitioned into $\mathcal{B}^1_c = \{\{b^s_{c,1}, b^s_{c,2}\}, \{b^s_{c,3}, b^s_{c,4}\}\}$ for $t = 1$ and $\mathcal{B}^2_c = \{\{b^s_{c,1}, b^s_{c,4}\}, \{b^s_{c,2}, b^s_{c,3}\}\}$ for $t = 2$ as described. Also, each of the 9 cluster users, $\mathcal{U}_c = \{u_1, u_2, \ldots, u_9\}$, is associated to one of the BS groups in each resource as $\mathcal{A}^1_c = \{[\{u_3, u_4, u_5\}, \{u_1, u_2\}], [\{u_6, u_9\}, \{u_7, u_8\}]\}$ for $t = 1$ and $\mathcal{A}^2_c = \{[\{u_5, u_6, u_7, u_8\}, \{u_3\}], [\{u_1, u_4, u_9\}, \{u_2\}]\}$ for $t = 2$ as described in Fig. 3. Note that $\mathcal{I}_{b^s_{c,1}}$ and $\mathcal{J}_{u_1}$ are given as $\mathcal{I}_{b^s_{c,1}} = [[\{u_3, u_4, u_5\}, \{u_1, u_2\}], [\{u_5, u_6, u_7, u_8\}, \{u_3\}]]$ and $\mathcal{J}_{u_1} = [[\emptyset, \{b^s_{c,1}, b^s_{c,2}\}], [\{b^s_{c,2}, b^s_{c,3}\}, \emptyset]]$, respectively.
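The toy example of Fig. 3 can be reproduced with a few lines of Python. The sketch below is only illustrative: the data layout (lists indexed by resource, dictionaries keyed by the scheme labels "Z" and "M") and the function names are assumptions made here, not part of the proposed protocol. It derives the user association vector of a BS and the BS group vector of a user from the grouping and association sets, following the index maps $\pi_b(t)$ and $\kappa^{\chi}_u(t)$.

```python
# Toy example of Fig. 3 with T = 2 resources; schemes are ordered as (Z, M).
SCHEMES = ("Z", "M")

# B[t] is the list of BS groups at resource t (0-based index for t = 1, 2).
B = [
    [{"b1", "b2"}, {"b3", "b4"}],
    [{"b1", "b4"}, {"b2", "b3"}],
]
# A[t][j][chi] is the user set served by group j with scheme chi at resource t.
A = [
    [{"Z": {"u3", "u4", "u5"}, "M": {"u1", "u2"}},
     {"Z": {"u6", "u9"}, "M": {"u7", "u8"}}],
    [{"Z": {"u5", "u6", "u7", "u8"}, "M": {"u3"}},
     {"Z": {"u1", "u4", "u9"}, "M": {"u2"}}],
]


def pi(b, t):
    """Index of the BS group that contains BS b at resource t."""
    return next(j for j, group in enumerate(B[t]) if b in group)


def kappa(u, chi, t):
    """Index of the group serving user u with scheme chi at resource t, or None."""
    return next((j for j, assoc in enumerate(A[t]) if u in assoc[chi]), None)


def association_vector(b):
    """User association vector I_b: one [Z-set, M-set] pair per resource."""
    return [[A[t][pi(b, t)][chi] for chi in SCHEMES] for t in range(len(B))]


def group_vector(u):
    """BS group vector J_u: serving group per scheme and resource (empty if none)."""
    vector = []
    for t in range(len(B)):
        row = []
        for chi in SCHEMES:
            j = kappa(u, chi, t)
            row.append(set() if j is None else B[t][j])
        vector.append(row)
    return vector
```

Calling `association_vector("b1")` and `group_vector("u1")` returns the two vectors quoted in the text for $b^s_{c,1}$ and $u_1$.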

IV. PROPOSED LOAD BALANCING SCHEME IN A CHCN

A. Network-wide fairness metric in a cHCN

The proportional fairness scheduling, originally proposed in [38], has been widely considered both in the literature and in real implementations of wireless cellular networks [39][40], because it schedules users in their peak channel states while providing balanced throughput among users and because it is feasible to implement. Thus, in this paper, the network-wide proportional fairness among users is considered as the network utility for the load balancing in a cHCN, similarly as in [19]-[24], which can be written as

$$\Upsilon = \sum_{u \in \mathcal{U}} U(R_u), \tag{1}$$

where $R_u$ denotes the average rate of user $u$ and $U(r)$ denotes the general $\alpha$-fairness utility function for a rate $r$. Here, $U(r) = \log(r)$ for $\alpha = 1$, and $U(r) = r^{1-\alpha}/(1-\alpha)$ for $\alpha \neq 1$, $\alpha > 0$. First, we focus only on the case where $\alpha = 1$, i.e., the proportional fairness [41] is considered. In order to perform a joint UA to improve the network utility, $R_u$ needs to be anticipated from the reported information and may be approximated as

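$$R_u = \begin{cases} (1 - \varsigma_d)\, \varphi_d(u)\, I_{u,d}, & u \in \mathcal{U}_d,\ d \in \mathcal{B}^m, \quad (2) \\ \sum_{t=1}^{T} \sum_{j=1}^{|\mathcal{B}^t_d|} \sum_{\chi \in \mathcal{X}} \eta^t_d\, \nu^{t,\chi}_{d,j}\, \varphi^{t,\chi}_{d,j}(u)\, I^{t,\chi}_{u,d}(\mathcal{B}^t_{d,j}), & u \in \mathcal{U}_d,\ d \in \mathcal{C}, \quad (3) \end{cases}$$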

where $I_{u,d}$ denotes the rate of user $u$ from mBS $d$ if scheduled, $\varphi_d(u)$ denotes the probability that user $u$ is scheduled by mBS $d$, $\eta^t_d$ denotes the portion that resource $t$ takes, $\nu^{t,\chi}_{d,j}$ denotes the portion that BS group $j$ takes for transmission scheme $\chi$ at resource $t$, $\varphi^{t,\chi}_{d,j}(u)$ denotes the probability that user $u$ is scheduled by BS group $j$ using transmission scheme $\chi$ at resource $t$, and $I^{t,\chi}_{u,d}(\mathcal{B}^t_{d,j})$ denotes the rate of user $u$ from BS group $j$ using transmission scheme $\chi$ at resource $t$ if scheduled.
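As a minimal sketch of how (1)-(3) combine, the following Python functions evaluate the $\alpha$-fairness utility and the two rate expressions. The function names and the tuple layout used for the terms of (3) are assumptions made for illustration; the scheduling probabilities and per-scheduling rates are taken as given inputs here.

```python
import math


def alpha_fair_utility(rate, alpha=1.0):
    """Alpha-fairness utility U(r): log(r) for alpha = 1, r^(1-alpha)/(1-alpha) otherwise."""
    if rate <= 0.0 or alpha <= 0.0:
        raise ValueError("rate and alpha must be positive")
    if alpha == 1.0:
        return math.log(rate)
    return rate ** (1.0 - alpha) / (1.0 - alpha)


def macro_user_rate(abs_ratio, sched_prob, rate_if_scheduled):
    """Average rate of a macro-cell user as in (2)."""
    return (1.0 - abs_ratio) * sched_prob * rate_if_scheduled


def cluster_user_rate(terms):
    """Average rate of a cluster user as in (3).

    `terms` holds one (eta, nu, phi, rate_if_scheduled) tuple per
    (resource t, BS group j, scheme chi) combination.
    """
    return sum(eta * nu * phi * rate for eta, nu, phi, rate in terms)


def network_utility(rates, alpha=1.0):
    """Network-wide fairness metric (1): sum of utilities over all users."""
    return sum(alpha_fair_utility(r, alpha) for r in rates)
```

For $\alpha = 1$ the metric reduces to the sum of log-rates, which is the proportional fairness case considered first in the text.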

For the macro-cell users, the rate of user $u$ from mBS $d$ if scheduled, $I_{u,d}$, can be approximated as

$$I_{u,d} = \begin{cases} q_d \left( \log_2\left(1 + \frac{S_u(d)}{W_u(d)}\right) + \left( \frac{T_u(d) + V_u(d)}{(S_u(d) + W_u(d))^2} - \frac{V_u(d)}{W_u^2(d)} \right) \log_2 e \right), & \frac{S_u(d)}{W_u(d)} \geq \gamma_{th}, \\ 0, & \text{o.w.}, \end{cases} \tag{4}$$

where $q_d \triangleq \min(N^A_d, |\mathcal{U}_d|)$ denotes the number of users that can be scheduled simultaneously by mBS $d$, and $S_u(d) \triangleq k_d \theta_d$, $W_u(d) \triangleq \sum_{b \in \mathcal{B}_u - \{d\}} \gamma_{u,b}$, $T_u(d) \triangleq k_d \theta_d^2$, and $V_u(d) \triangleq \sum_{b \in \mathcal{B}_u - \{d\}} \gamma^2_{u,b}$ denote the average signal power, the average interference power, the variance of the signal power, and the variance of the interference power, respectively. Here,

$$k_d \triangleq \begin{cases} c^1_d N^A_d F_A^{-1}\left(p(\mu_d); N^A_d\right), & 1 < N^A_d < |\mathcal{U}_d|, \\ N^A_d - q_d + 1, & \text{o.w.}, \end{cases} \tag{6}$$

denotes the shape parameter for the signal power when it is modeled as a Gamma random variable as in [42], where $N^A_d$ denotes the number of antennas at the macro-cell BS $d$, $F_A(x; N^A_d) = 1 - (1 - x)^{N^A_d - 1}$ denotes the cumulative distribution function (CDF) of the squared absolute inner product between the normalized channel vector of size $N^A_d$ and the corresponding zero forcing precoding vector of size $N^A_d$ for a randomly selected user, $p(\mu_d) = \mu_d/(\mu_d + 1)$ denotes a percentile value corresponding to the largest value among $\mu_d$ values in a CDF, and

$$\mu_d \triangleq \begin{cases} l\left(|\mathcal{U}_d|; N^A_d\right), & 1 < \frac{|\mathcal{U}_d|}{N^A_d} < 2, \\ |\mathcal{U}_d|, & \text{o.w.}, \end{cases} \tag{7}$$

denotes the effective number of users considering multiuser MIMO (ZFBF). Here, $l(x; N^A_d) = (2 - 1/N^A_d)(x - N^A_d) + 1$ denotes a linear transformation, by which $N^A_d < x < 2N^A_d$ is mapped to $1 < l(x; N^A_d) < 2N^A_d$. Note that the shape parameter gain for the signal power due to the use of multiple antennas is reflected similarly as in [43]. In addition,

$$\theta_d \triangleq m_d \gamma_{u,d} / N^A_d \tag{8}$$


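$$I^{t,\chi}_{u,d}(\mathcal{B}^t_{d,j}) = \begin{cases} q^{t,\chi}_{d,j} \left( \log_2\left(1 + \frac{S^{\chi}_u(\mathcal{B}^t_{d,j})}{W^t_u(\mathcal{B}^t_{d,j})}\right) + \left( \frac{T^{\chi}_u(\mathcal{B}^t_{d,j}) + V^t_u(\mathcal{B}^t_{d,j})}{\left(S^{\chi}_u(\mathcal{B}^t_{d,j}) + W^t_u(\mathcal{B}^t_{d,j})\right)^2} - \frac{V^t_u(\mathcal{B}^t_{d,j})}{\left(W^t_u(\mathcal{B}^t_{d,j})\right)^2} \right) \log_2 e \right), & \frac{S^{\chi}_u(\mathcal{B}^t_{d,j})}{W^t_u(\mathcal{B}^t_{d,j})} \geq \gamma_{th}, \\ 0, & \text{o.w.} \end{cases} \tag{5}$$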

denotes the scale parameter for the signal power when it is modeled as a Gamma random variable, where

$$m_d \triangleq \left( (c^2_d)^{\mathbb{1}\{N^A_d > 1\}}\, F^{-1}\left(p(\mu_d); N^A_d, (N^A_d)^{-1}\right) \right)^{\mathbb{1}\{|\mathcal{U}_d| > N^A_d\}} \tag{9}$$

reflects the multiuser diversity gain due to PFS similarly as in [44] and $F(x; k, \theta) = \gamma(k, x/\theta)/\Gamma(k)$, $\gamma(k, x/\theta) = \int_0^{x/\theta} t^{k-1} e^{-t}\, dt$. Note that in (6) and (9), two constants $c^1_d$ and $c^2_d$ are introduced to reflect the correlation effect between the shape parameter gain for the signal power due to the use of multiple antennas and the multiuser diversity gain obtained from the instantaneous channel gain of the selected signal of the ZFBF due to PFS. Here, the values are determined as follows. Let $A^n_{(1)}$ and $B^n_{(1)}$ denote the largest value among $A_1, A_2, \ldots, A_n$ generated from a random variable $A$ and among $B_1, B_2, \ldots, B_n$ generated from a random variable $B$, respectively, and let $A^n_{\langle 1 \rangle}$ and $B^n_{\langle 1 \rangle}$ denote the corresponding values of $A$ and $B$ for the largest one among $A_1 B_1, A_2 B_2, \ldots, A_n B_n$. Then, the values of $c^1_d$ and $c^2_d$ are determined as $c^1_d = E[A^n_{\langle 1 \rangle}]/E[A^n_{(1)}]$ and $c^2_d = E[B^n_{\langle 1 \rangle}]/E[B^n_{(1)}]$ for a random variable $A$ with the CDF of $F_A(x; N^A_d)$ and a Gamma random variable $B \sim \Gamma(N^A_d, (N^A_d)^{-1})$ denoting the normalized channel gain, similarly as in [42].
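The ratios defining $c^1_d$ and $c^2_d$ can be estimated numerically. The sketch below is a Monte Carlo illustration under stated assumptions, not the evaluation method of the paper: it uses the fact that the CDF $1 - (1 - x)^{N-1}$ is that of a Beta$(1, N-1)$ variable, draws $A$ and $B$ independently, and the function name, trial count, and seed are hypothetical choices.

```python
import random


def estimate_correlation_constants(num_antennas, n, trials=2000, seed=0):
    """Monte Carlo estimate of (c1, c2) = (E[A<1>]/E[A(1)], E[B<1>]/E[B(1)]).

    A ~ Beta(1, N-1) has the CDF 1 - (1 - x)^(N-1); B ~ Gamma(shape N, scale 1/N).
    A<1>, B<1> are the values belonging to the sample with the largest product A*B.
    """
    if num_antennas < 2:
        raise ValueError("num_antennas must be at least 2")
    rng = random.Random(seed)
    a_max = b_max = a_sel = b_sel = 0.0
    for _ in range(trials):
        a = [rng.betavariate(1.0, num_antennas - 1.0) for _ in range(n)]
        b = [rng.gammavariate(num_antennas, 1.0 / num_antennas) for _ in range(n)]
        winner = max(range(n), key=lambda i: a[i] * b[i])
        a_max += max(a)
        b_max += max(b)
        a_sel += a[winner]
        b_sel += b[winner]
    return a_sel / a_max, b_sel / b_max
```

Because the value at the product-maximizing index can never exceed the per-variable maximum, both estimates lie in $(0, 1]$, i.e., the constants discount the two gains when they are applied jointly.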

Note that (4) adopts the Gamma distribution approximation as in [42] and is further modified by considering PFS, in which the total interference power can be approximated as the sum of the average powers from interfering BSs so that the first term represents the rate expected from the averaged combined signal power and the averaged interference power and the remaining term denotes the rate due to the variations on the signal and interference powers, which comes from the approximated expression on the digamma function [42]. Here, the signal power is approximated as a Gamma random variable as in [42] with the modified shape and scale parameter considering the multiuser diversity due to PFS.
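The macro-cell quantities in (4), (6), and (7) are simple to evaluate once the constants are known. The following Python sketch is illustrative only: the function names are assumptions, $c^1_d$ is passed in as a plain number, and the scale parameter (which needs the inverse Gamma CDF in (9)) is left out.

```python
import math


def linear_map(x, n_ant):
    """l(x; N) = (2 - 1/N)(x - N) + 1, mapping N < x < 2N to 1 < l < 2N."""
    return (2.0 - 1.0 / n_ant) * (x - n_ant) + 1.0


def effective_users(num_users, n_ant):
    """Effective number of users mu_d as in (7)."""
    if 1.0 < num_users / n_ant < 2.0:
        return linear_map(num_users, n_ant)
    return float(num_users)


def percentile(mu):
    """p(mu) = mu / (mu + 1)."""
    return mu / (mu + 1.0)


def inv_cdf_a(p, n_ant):
    """Inverse of F_A(x; N) = 1 - (1 - x)^(N - 1), valid for N >= 2."""
    return 1.0 - (1.0 - p) ** (1.0 / (n_ant - 1.0))


def shape_parameter(num_users, n_ant, c1=1.0):
    """Shape parameter k_d as in (6)."""
    q = min(n_ant, num_users)
    if 1 < n_ant < num_users:
        mu = effective_users(num_users, n_ant)
        return c1 * n_ant * inv_cdf_a(percentile(mu), n_ant)
    return n_ant - q + 1.0


def approx_rate(q, s, w, t, v, gamma_th):
    """Rate if scheduled as in (4): s, w are mean signal/interference powers,
    t, v their variances, and gamma_th the SIR threshold."""
    if s / w < gamma_th:
        return 0.0
    correction = (t + v) / (s + w) ** 2 - v / w ** 2
    return q * (math.log2(1.0 + s / w) + correction * math.log2(math.e))
```

The same `approx_rate` form applies to (5) when the group-level quantities of a small-cell cluster are supplied instead.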

For the small-cell cluster users, (3) considers different combinations of RP and CoMP. By adopting the Gamma distribution approximation again, $I^{t,\chi}_{u,d}(\mathcal{B}^t_{d,j})$ can be similarly approximated as in (5). Here, $q^{t,\chi}_{d,j} \triangleq \min(|\mathcal{B}^t_{d,j}|^{\mathbb{1}\{\chi = Z\}}, |\mathcal{A}^{t,\chi}_{d,j}|)$ denotes the number of users that can be scheduled simultaneously, and $S^{\chi}_u(\mathcal{B}^t_{d,j}) \triangleq k^{t,\chi}_{d,j} \theta^{t,\chi}_{d,j}$, $W^t_u(\mathcal{B}^t_{d,j}) \triangleq \sum_{b \in \mathcal{B}^{t,j}_{u,d}} \gamma_{u,b}$, $T^{\chi}_u(\mathcal{B}^t_{d,j}) \triangleq k^{t,\chi}_{d,j} (\theta^{t,\chi}_{d,j})^2$, and $V^t_u(\mathcal{B}^t_{d,j}) \triangleq \sum_{b \in \mathcal{B}^{t,j}_{u,d}} \gamma^2_{u,b}$ denote the average signal power, the average interference power, the variance of the signal power, and the variance of the interference power, respectively. Here,

$$\theta^{t,\chi}_{d,j} \triangleq m_{d,\chi}(\mathcal{B}^t_{d,j}) \sum_{b \in \mathcal{B}^t_{d,j}} \gamma_{u,b} / |\mathcal{B}^t_{d,j}| \tag{10}$$

denotes the scale parameter for the signal power when it is modeled as a Gamma random variable, where

$$m_{d,\chi}(\mathcal{B}^t_{d,j}) \triangleq \left( (c^4_{d,j})^{d_1 \cdot d_2}\, F^{-1}\left(p(\mu^{t,\chi}_{d,j}); \varepsilon^{t,\chi}_{d,j}, (\varepsilon^{t,\chi}_{d,j})^{-1}\right) \right)^{d_3}, \tag{11}$$

with $d_1 = \mathbb{1}\{\chi = Z\}$, $d_2 = \mathbb{1}\{|\mathcal{B}^t_{d,j}| > 1\}$, and $d_3 = \mathbb{1}\{|\mathcal{A}^{t,\chi}_{d,j}| > |\mathcal{B}^t_{d,j}|^{d_1}\}$, reflects the multiuser diversity gain due to PFS by introducing a constant $c^4_{d,j}$ at the effective numbers of users and BSs considering CoMP, given by

$$\mu^{t,\chi}_{d,j} \triangleq \begin{cases} l\left(|\mathcal{A}^{t,\chi}_{d,j}|, |\mathcal{B}^t_{d,j}|\right), & \text{if } d_1 \cdot d_4 = 1, \\ |\mathcal{A}^{t,\chi}_{d,j}|, & \text{o.w.}, \end{cases} \tag{12}$$

with $d_4 = \mathbb{1}\{|\mathcal{A}^{t,\chi}_{d,j}| < 2|\mathcal{B}^t_{d,j}|\}$ and

$$\varepsilon^{t,\chi}_{d,j} \triangleq |\mathcal{A}^{t,\chi}_{d,j}|^{-1} \sum_{u \in \mathcal{A}^{t,\chi}_{d,j}} \left( \sum_{b \in \mathcal{B}^t_{d,j}} \gamma_{u,b} / \max_{b \in \mathcal{B}^t_{d,j}} \gamma_{u,b} \right), \tag{13}$$

respectively. In addition,

$$k^{t,\chi}_{d,j} \triangleq \begin{cases} c^3_{d,j}\, |\mathcal{B}^t_{d,j}|\, F_A^{-1}\left(p(\mu^{t,\chi}_{d,j}); |\mathcal{B}^t_{d,j}|\right), & \text{if } d_1 \cdot d_2 \cdot d_3 = 1, \\ |\mathcal{B}^t_{d,j}| - q^{t,\chi}_{d,j} + 1, & \text{o.w.}, \end{cases} \tag{14}$$

denotes the shape parameter for the signal power when it is modeled as a Gamma random variable, where the shape parameter gain is reflected similarly as in (6). Here, $\mathcal{B}^{t,j}_{u,d} \triangleq \mathcal{B}_u - (\{b(d)\} \cup \mathcal{B}^t_{d,j})$ for $t \in \mathcal{T}_A$, $\mathcal{B}^{t,j}_{u,d} \triangleq \mathcal{B}_u - \mathcal{B}^t_{d,j}$ for $t \in \mathcal{T}_N$, $\mathcal{T}_N$ ($\mathcal{T}_A$) denotes the set of resources using an NS (or an ABS), and $b(d)$ denotes the mBS to which cluster $d$ is associated. Similarly as in (4), the first term in (5) represents the rate expected from the averaged combined signal power and the averaged interference power due to the cooperation and the remaining term denotes the rate due to the variations on the signal and interference powers. In case of MRT with PFS, since a signal is transmitted towards its channel direction, the shape parameter is given as the number of antennas or the number of cooperating sBSs, similarly as in [42], while the scale parameter needs to be modified to reflect the multiuser diversity gain. One reasonable way is to magnify the original scale parameter in [42] by the amount of the multiuser diversity gain obtained from the use of PFS as in (10), assuming that $\mu^{t,\chi}_{d,j}$ users are competing with their normalized channel gains. On the other hand, in case of ZFBF with PFS, the multiuser gain due to PFS may be introduced as follows. Let $A$ be a random variable with the CDF of $F_A(x; |\mathcal{B}^t_{d,j}|)$ denoting the squared absolute inner product between the normalized channel vector and the corresponding zero forcing precoding vector using CoMP among BSs in $\mathcal{B}^t_{d,j}$ for a randomly selected cluster user and $B \sim \Gamma(|\mathcal{B}^t_{d,j}|, |\mathcal{B}^t_{d,j}|^{-1})$ be a random variable denoting the


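$$\left\{\mathbf{X}^{(n)}_d, \varsigma^{(n)}_d\right\} = \arg\max_{\{\mathbf{X}_d, \varsigma_d\}} \sum_{u \in \mathcal{U}} \left( \sum_{d \in \mathcal{B}^m} x_{u,d}\, U\left(\varphi_d(u) \left|\mathcal{U}^{(n-1)}_d\right| R_u\right) + \sum_{d \in \mathcal{C}} x_{u,d}\, U\left(\varphi_d(u) \left|\mathcal{U}^{(n-1)}_d\right| R_{u,d}\left(\mathcal{Q}^{(n-1)}_d\right)\right) \right) + \sum_{d \in \mathcal{B}^m} \left(\varsigma_d - \varsigma^{(n-1)}_d\right) \sum_{c \in \mathcal{C}_d} \mu^{(n-1)}_c + \sum_{d \in \mathcal{C}} \sum_{u \in \mathcal{U}^{(n-1)}_d} \sigma^{(n-1)}_{u,d} \left(x_{u,d} - x^{(n-1)}_{u,d}\right). \tag{15}$$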

normalized channel gain assuming that the effective number of BSs is $|\mathcal{B}^t_{d,j}|$. Then, the values of $c^3_{d,j}$ and $c^4_{d,j}$ are determined as $c^3_{d,j} = E[A^n_{\langle 1 \rangle}]/E[A^n_{(1)}]$ and $c^4_{d,j} = E[B^n_{\langle 1 \rangle}]/E[B^n_{(1)}]$, respectively, for reflecting the correlation effect between the shape parameter gain for the signal power due to the use of multiple BSs and the multiuser diversity gain obtained from the instantaneous channel gain of the selected signal of the ZFBF (or MRT) CoMP due to PFS in the shape and scale parameters similarly as in the macro-cell case.
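The group-level quantities $q^{t,\chi}_{d,j}$, $\mu^{t,\chi}_{d,j}$ in (12), and $\varepsilon^{t,\chi}_{d,j}$ in (13) depend only on the group size, the number of associated users, and the average channel gains. A minimal sketch follows; the function names and the nested-list layout for the gains are assumptions made for illustration.

```python
def scheduled_users(group_size, num_assoc, zfbf):
    """q = min(|B|^{1{chi=Z}}, |A|): up to |B| users for ZFBF, one user for MRT."""
    return min(group_size if zfbf else 1, num_assoc)


def effective_users(num_assoc, group_size, zfbf):
    """Effective number of competing users as in (12).

    The linear map l(.) is applied only for ZFBF (d1 = 1) with |A| < 2|B| (d4 = 1).
    The result is meaningful when |A| > |B|, i.e., when the diversity gain is active.
    """
    if zfbf and num_assoc < 2 * group_size:
        return (2.0 - 1.0 / group_size) * (num_assoc - group_size) + 1.0
    return float(num_assoc)


def effective_bs(gains):
    """Effective number of BSs as in (13).

    gains[i] lists the average gains gamma_{u,b} of the i-th associated user
    towards every BS b of the group.
    """
    return sum(sum(g) / max(g) for g in gains) / len(gains)
```

For instance, a user that receives equal power from both BSs of a two-BS group contributes 2 to the average in (13), whereas a user dominated by one BS contributes a value close to 1.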

B. EPC-MME operation

If the EPC-MME can obtain the full information on the transmission scheme and the resource partitioning with scheduling strategy for each possible BS grouping set on each resource of each cluster $c \in \mathcal{C}$, (2) and (3) can be evaluated from $\{\gamma_u\}$ so that the EPC-MME can determine $\{\mathcal{U}_d \mid d \in \mathcal{B}^m \cup \bigcup_{c \in \mathcal{C}} \mathcal{B}^s_c\} \in \rho(\mathcal{U})$ and $\{\varsigma_d \mid d \in \mathcal{B}^m\}$ as well as the resource partitioning and BS grouping in each cluster to maximize the network-wide proportional fairness among users. However, although such a fully centralized joint UA can achieve an optimal solution, the EPC-MME suffers from formidable signaling overhead and computational complexity.

On the other hand, the proposed scheme does not require the full information and only the macroscopic UA is performed at the EPC-MME as described in Section III. For this, it is assumed that another approximated user rate $R_{u,c}$, instead of $R_u$ in (3), is used. Although any good $R_{u,c}$ can be applied,

$$R_{u,c}(\mathcal{Q}_c) = \max_{\chi \in \mathcal{X}} \left( \xi_{c,\chi,A}\, I^{t_A,\chi}_{u,c}\left(\omega_u(\mathcal{S}_{c,\chi})\right) + \xi_{c,\chi,N}\, I^{t_N,\chi}_{u,c}\left(\omega_u(\mathcal{S}_{c,\chi})\right) \right), \tag{16}$$

for the cluster uplink information from cluster $c$, denoted as $\mathcal{Q}_c = \{\mathcal{S}_{c,\chi}, k_{c,\chi}, m_{c,\chi}, \xi_{c,\chi,A}, \xi_{c,\chi,N} \mid \chi \in \mathcal{X}\}$, is assumed, where the values for $\mathcal{S}_{c,\chi}$, $k_{c,\chi}$, $m_{c,\chi}$, $\xi_{c,\chi,A}$, and $\xi_{c,\chi,N}$ come from the result of the previous iteration at each C-MME, $\omega_u(\mathcal{S}) = \{b \in \mathcal{B}_u \cap \mathcal{B}^s_c \mid \mathrm{rank}(\gamma_{u,b}, \{\gamma_{u,b'} \mid b' \in \mathcal{B}_u \cap \mathcal{B}^s_c\}) \in \mathcal{S}\}$, and $\mathrm{rank}(a, \mathcal{A})$ for $a \in \mathcal{A}$ denotes the rank (in a descending order) of $a$ in $\mathcal{A}$. Although such parameters for $R_{u,c}$ are selected to fit $R_u$ well in each C-MME and delivered to the EPC-MME, there remains a residual error in evaluating the network-wide fairness metric and a linear approximation on the residual metric error is additionally taken into account, in which the corresponding parameter sets are also determined and delivered from C-MMEs during the previous iteration.
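The rank-set selection $\omega_u(\mathcal{S})$ used in (16) can be sketched as follows. The dictionary-based input and the function name are assumptions for illustration, and ties between equal gains are broken arbitrarily here since the text does not specify a rule.

```python
def rank_set_bss(gains, rank_set):
    """Return omega_u(S): the cluster BSs whose gain rank (1 = strongest) lies in S.

    gains maps each candidate sBS b of the cluster to the average gain gamma_{u,b}
    seen by user u; rank_set is a set of 1-based ranks.
    """
    ordered = sorted(gains, key=gains.get, reverse=True)
    return {b for rank, b in enumerate(ordered, start=1) if rank in rank_set}
```

With a typical rank set such as $\{1, 2\}$, every user is assumed to be served by its two strongest cluster sBSs, which lets the EPC-MME evaluate (16) without knowing the actual BS grouping.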

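At the $n$th iteration, the EPC-MME performs the macroscopic UA by considering each small-cell cluster as an mBS with approximated user rates for the users of each small-cell cluster. The residual error in the network-wide fairness metric caused by using such approximated user rates is further compensated by using the cluster uplink information. Denoting $x_{u,d}$ as a macroscopic UA variable for user $u$ to the macro-cell BS or cluster $d$, the macroscopic UA problem at the $n$th iteration can be written as in (15) with the following constraints:

$$\sum_{d \in \mathcal{B}^m \cup \mathcal{C}} x_{u,d} = 1, \quad x_{u,d} \in [0, 1], \quad u \in \mathcal{U},\ d \in \mathcal{B}^m \cup \mathcal{C}, \tag{17}$$

$$\left\| x_{u,d} - x^{(n-1)}_{u,d} \right\| \leq \delta^{(n-1)}_d, \quad u \in \mathcal{U}^{(n-1)}_d,\ d \in \mathcal{C}, \tag{18}$$

$$\left\| \varsigma_d - \varsigma^{(n-1)}_d \right\| \leq \min\left\{ \varepsilon^{(n-1)}_c \mid c \in \mathcal{C}_d \right\}, \quad \varsigma_d \in [0, 1],\ d \in \mathcal{B}^m, \tag{19}$$

where $\mathbf{X}^{(n)}_d = \{x^{(n)}_{u,d} \mid u \in \mathcal{U}^{(n)}_d\}$, $\mathcal{U}^{(n)}_d = \{u' \in \bar{\mathcal{U}}^{(n)}_d \mid x^{(n)}_{u',d} \geq 0.5\}$,

$$\bar{\mathcal{U}}^{(n)}_d = \begin{cases} \left\{ u \in \mathcal{U} \,\middle|\, \max_{\chi \in \mathcal{X},\, t \in \mathcal{T}_A} \frac{S^{\chi}_u\left(\omega_u\left(\left[\left|\mathcal{S}^{(n-1)}_{d,\chi}\right|\right]\right)\right)}{W^t_u\left(\omega_u\left(\left[\left|\mathcal{S}^{(n-1)}_{d,\chi}\right|\right]\right)\right)} \geq \gamma_{th} \right\}, & \text{for } d \in \mathcal{C}, \\ \left\{ u \in \mathcal{U} \,\middle|\, \frac{S_u(d)}{W_u(d)} \geq \gamma_{th} \right\}, & \text{for } d \in \mathcal{B}^m, \end{cases} \tag{20}$$

and $\varphi_d(u)$ can be given by

$$\varphi_d(u) = r^{(n)}_{u,d} \Big/ \sum_{u' \in \mathcal{U}} x_{u',d}\, r^{(n)}_{u',d}, \tag{21}$$

where

$$r^{(n)}_{u,d} = \begin{cases} \left( \left|\mathcal{U}^{(n-1)}_d\right| R_u \right)^{\frac{1-\alpha}{\alpha}}, & \text{for } d \in \mathcal{B}^m, \\ \left( \left|\mathcal{U}^{(n-1)}_d\right| R_{u,d}\left(\mathcal{Q}^{(n-1)}_d\right) \right)^{\frac{1-\alpha}{\alpha}}, & \text{for } d \in \mathcal{C}, \end{cases} \tag{22}$$

similarly as in [41]. Here, the first part of (15) is the approximated metric using $R_{u,c}$ and the remaining part compensates the residual error by using a linear approximation. Also, the constraint (17) is for a relaxed single-BS association as in [19], the constraints (18) and (19) are for limiting the macroscopic user association variables $\{x_{u,d}\}$ and the ABS ratio variables $\{\varsigma_d\}$ within their trust regions³ for this iteration determined by C-MMEs, respectively, and $\mathbf{X}^{(n)}_d$ and $\mathcal{U}^{(n)}_d$ denote the set of the macroscopic user association variables and the set of effective users for $d \in \mathcal{B}^m \cup \mathcal{C}$ at the $n$th iteration, respectively. In addition, $\mathcal{Q}^{(n)}_c = \{\mathcal{S}^{(n)}_{c,\chi}, k^{(n)}_{c,\chi}, m^{(n)}_{c,\chi}, \xi^{(n)}_{c,\chi,A}, \xi^{(n)}_{c,\chi,N} \mid \chi \in \mathcal{X}\}$, $\nabla^{(n)}_c = [\{\sigma^{(n)}_{u,c} \mid u \in \mathcal{U}^{(n)}_c\}, \delta^{(n)}_c]$, and $\Delta^{(n)}_c = [\alpha^{(n)}_c, \beta^{(n)}_c]$ denote the cluster uplink information from cluster $c$ during the $n$th iteration, where $\mathcal{S}^{(n)}_{c,\chi}$, $k^{(n)}_{c,\chi}$, $m^{(n)}_{c,\chi}$, $\xi^{(n)}_{c,\chi,A}$, and $\xi^{(n)}_{c,\chi,N}$ denote the typical rank set, the shape parameter, the multiuser diversity gain of the scale parameter, the scheduling probability at an ABS, and the scheduling probability at an NS using transmission scheme $\chi$, respectively, $\sigma^{(n)}_{u,c}$ and $\delta^{(n)}_c$ denote the gradients of the residual metric error due to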

³ The trust region concept is adopted from [48].


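$$\left\{\mathbf{Y}^{(n)}_c, \mathbf{Z}^{(n)}_c, \mathbf{W}^{(n)}_c, \eta^{(n)}_c, \nu^{(n)}_c\right\} = \arg\max_{\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c, \eta_c, \nu_c\}} \sum_{u \in \mathcal{U}^{(n)}_c} U\left( \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j}\, \varphi^{t,\chi}_{c,j}(u)\, \eta^t_c\, \nu^{t,\chi}_{c,j}\, I^{t,\chi}_{u,c}\left(\mathcal{B}^t_{c,j}\right) + w_{u,c}\, \varphi^m_c(u) \left(1 - \varsigma^{(n)}_{m(u)}\right) I_{u,m(u)} \right). \tag{23}$$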

the changes in the user association variable $x_{u,c}$ and in the corresponding ABS ratio variable $\varsigma_c$, respectively, and $\alpha^{(n)}_c$ and $\beta^{(n)}_c$ denote the suggestions on the half length of the edge of the cubic trust region according to the user association variable and the ABS ratio, respectively. Note that the above problem is not convex but it has a special structure that lets it become convex in $\{x_{u,d} \mid u \in \mathcal{U}, d \in \mathcal{B}^m \cup \mathcal{C}\}$ for a given $\{\varsigma_d \mid d \in \mathcal{B}^m\}$ and vice versa, similarly as in [19]. Thus, $\{x_{u,d} \mid u \in \mathcal{U}, d \in \mathcal{B}^m \cup \mathcal{C}\}$ and $\{\varsigma_d \mid d \in \mathcal{B}^m\}$ can be found iteratively by fixing one set of variables and optimizing the other with a convex programming tool such as CVX [45].
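The scheduling probability in (21)-(22) is a normalized weight that depends on the fairness parameter $\alpha$. The sketch below is a plain illustration with assumed function names; rates and association variables are given as lists over the users.

```python
def pf_weight(load, rate, alpha=1.0):
    """Weight r_{u,d} as in (22): (|U_d| * R)^((1 - alpha) / alpha)."""
    return (load * rate) ** ((1.0 - alpha) / alpha)


def scheduling_probabilities(assoc, weights):
    """phi_d(u) as in (21) for every user, given association variables x_{u,d}."""
    denom = sum(x * w for x, w in zip(assoc, weights))
    return [w / denom for w in weights]
```

For $\alpha = 1$ every weight equals 1, so (21) reduces to an equal share $1 / \sum_{u'} x_{u',d}$ among the users associated with $d$, which is the usual resource-fair behavior of PFS.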

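Finally, the cluster downlink information $\Psi^{(n)}_c = [\mathcal{U}^{(n)}_c, \mathbf{X}^{(n)}_c, \phi^{(n)}_c]$ is updated and delivered to each cluster, where $\phi^{(n)}_c = \{(\varsigma^{(n)}_d, N^{(n)}_{d,c}) \mid d = m(u) \text{ for } u \in \mathcal{U}^{(n)}_c \text{ or } d = b \text{ such that } c \in \mathcal{C}_b\}$, $N^{(n)}_{d,c} = \sum_{u \in \mathcal{U}^{(n)}_d - \mathcal{U}^{(n)}_c} r^{(n)}_{u,d}$, and $m(u) = \arg\max_{d \in \mathcal{B}^m \cap \mathcal{B}_u} \gamma_{u,d}$.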

C. C-MME operation 1: joint UA and resource partitioning

Based on the cluster downlink information, the C-MME needs to determine the joint UA and resource partitioning for $T$ resources, i.e., $\mathcal{A}_c$, $\eta_c = \{\eta^t_c\}_{t \in [T]}$, $\nu_c = \{\nu^{t,\chi}_{c,j}\}_{t \in [T], j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}}$ for a pre-determined $\mathcal{B}_c$. Also, the cluster uplink information $\Gamma^{(n)}_c$ needs to be updated.

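As described in Section II, JP-CoMP with a semi-dynamic BS grouping and resource partitioning is adopted for the joint UA in each C-MME to improve the load balancing capability in a cHCN. Let $z_{u,c} \in \{0, 1\}$ denote the ABS subframe association indicator in cluster $c$ for user $u$, $w_{u,c} \in \{0, 1\}$ denote the macro-cell association indicator in cluster $c$ for user $u$, $y^{t,\chi}_{u,c,j} \in \{0, 1\}$ denote the user association indicator of user $u$ for BS group $j$ using transmission scheme $\chi$ at resource $t$ in cluster $c$, $\mathbf{Z}_c = \{z_{u,c} \mid u \in \mathcal{U}^{(n)}_c\}$ denote the set of the ABS subframe association indicators in cluster $c$, $\mathbf{W}_c = \{w_{u,c} \mid u \in \mathcal{U}^{(n)}_c\}$ denote the set of the macro-cell association indicators in cluster $c$, and $\mathbf{Y}_c = \{y^{t,\chi}_{u,c,j} \mid \forall u \in \mathcal{U}^{(n)}_c, \forall \chi \in \mathcal{X}, \forall t \in [T], \forall j \in [|\mathcal{B}^t_c|]\}$ denote the set of the user association indicators in cluster $c$. Here, $\varphi^{t,\chi}_{c,j}(u)$ in (3) can be given by

$$\varphi^{t,\chi}_{c,j}(u) = r^{t,\chi}_{u,c,j} \Big/ \sum_{u' \in \mathcal{U}^{(n)}_c} y^{t,\chi}_{u',c,j}\, r^{t,\chi}_{u',c,j}, \tag{24}$$

where

$$r^{t,\chi}_{u,c,j} = \left( \eta^t_c\, \nu^{t,\chi}_{c,j}\, I^{t,\chi}_{u,c}\left(\mathcal{B}^t_{c,j}\right) \right)^{\frac{1-\alpha}{\alpha}}, \tag{25}$$

similarly as in [41]. Then, a dynamic optimization problem at the $n$th iteration in the C-MME of cluster $c$ can be written as in (23) with the following constraints:

$$y^{t,\chi}_{u,c,j}, z_{u,c}, w_{u,c} \in \{0, 1\}, \quad z_{u,c} + w_{u,c} \leq 1, \quad u \in \mathcal{U}^{(n)}_c,\ \chi \in \mathcal{X},\ t \in [T],\ j \in [|\mathcal{B}^t_c|], \tag{26}$$

$$\sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j} \leq z_{u,c}, \quad u \in \mathcal{U}^{(n)}_c,\ t \in \mathcal{T}_A, \tag{27}$$

$$\sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j} \leq 1 - w_{u,c} - z_{u,c}, \quad u \in \mathcal{U}^{(n)}_c,\ t \in \mathcal{T}_N, \tag{28}$$

$$\sum_{\chi \in \mathcal{X}} \nu^{t,\chi}_{c,j} = 1, \quad \nu^{t,\chi}_{c,j} \in [0, 1], \quad t \in [T],\ \chi \in \mathcal{X},\ j \in [|\mathcal{B}^t_c|], \tag{29}$$

$$\sum_{t \in \mathcal{T}_A} \eta^t_c = \varsigma^{(n)}_{b(c)}, \quad \sum_{t \in \mathcal{T}_N} \eta^t_c = 1 - \varsigma^{(n)}_{b(c)}, \quad \eta^t_c \in [0, 1],\ t \in [T], \tag{30}$$

where $\varphi^m_c(u)$ can be given by

$$\varphi^m_c(u) = r^{(n)}_{u,m(u)} \Big/ \left( N^{(n)}_{m(u),c} + \sum_{u' \in \mathcal{U}^{(n)}_c,\, m(u') = m(u)} w_{u',c}\, r^{(n)}_{u',m(u')} \right), \tag{31}$$

where

$$r^{(n)}_{u,m(u)} = \left( \left(1 - \varsigma^{(n)}_{m(u)}\right) I_{u,m(u)} \right)^{\frac{1-\alpha}{\alpha}}, \tag{32}$$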

similarly as in [41]. Here, the first part of (23) is the expected rate from the small-cell cluster and the second part is the expected rate from the neighboring mBSs. Also, the constraint (26) is for the single-BS association so that $z_{u,c} + w_{u,c}$ should be less than or equal to 1, the constraint (27) is for the ABS subframe association such that the sum of the association indicators $y^{t,\chi}_{u,c,j}$ of user $u$ at each resource $t \in \mathcal{T}_A$ should be consistent with $z_{u,c}$ (less than $z_{u,c}$ if not associated to the resource or equal to $z_{u,c}$ if associated to the resource), the constraint (28) is for the NS subframe association in a similar way, the constraint (29) is for setting the sum of the portions according to all possible transmission schemes to 1 for each BS group, and the constraint (30) is for setting the sum of the resource portions according to ABS and NS to $\varsigma^{(n)}_{b(c)}$ and $1 - \varsigma^{(n)}_{b(c)}$, respectively. Note that the above problem in (23) is a non-convex mixed-integer nonlinear programming problem and is NP-hard as stated in [46]. Thus, finding its optimal solution is intractable.
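To make the roles of (26)-(30) concrete, the following sketch checks a candidate solution against them. It is only an illustrative feasibility test under assumed data layouts (nested lists indexed by resource and BS group, dictionaries keyed by scheme); it is not the optimization procedure itself.

```python
def user_constraints_ok(y, z, w, abs_resources):
    """Check (26)-(28) for one user.

    y[t][j][chi] are the 0/1 association indicators, z and w the ABS and
    macro-cell indicators, abs_resources the set of ABS resource indices.
    """
    if z not in (0, 1) or w not in (0, 1) or z + w > 1:
        return False
    for t, groups in enumerate(y):
        values = [v for group in groups for v in group.values()]
        if any(v not in (0, 1) for v in values):
            return False
        limit = z if t in abs_resources else 1 - w - z
        if sum(values) > limit:
            return False
    return True


def resource_constraints_ok(nu, eta, abs_resources, abs_ratio, tol=1e-9):
    """Check (29)-(30): scheme portions sum to 1 per group, and the ABS/NS
    resource portions sum to the ABS ratio and its complement."""
    for groups in nu:
        for group in groups:
            if any(not 0.0 <= v <= 1.0 for v in group.values()):
                return False
            if abs(sum(group.values()) - 1.0) > tol:
                return False
    if any(not 0.0 <= e <= 1.0 for e in eta):
        return False
    abs_sum = sum(e for t, e in enumerate(eta) if t in abs_resources)
    ns_sum = sum(eta) - abs_sum
    return abs(abs_sum - abs_ratio) <= tol and abs(ns_sum - (1.0 - abs_ratio)) <= tol
```

A user with $z_{u,c} = 1$ may therefore be associated only on ABS resources, a user with $z_{u,c} = w_{u,c} = 0$ only on NS resources, and a user with $w_{u,c} = 1$ is left to the macro cell.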

In order to efficiently find a decent suboptimal solution, a semi-dynamic approach is taken, in which it is assumed that each ABS (NS) user is associated to all the ABS (NS) resources. Then, equality is instead used in (27) or (28), similarly as in [47], and only an adequate portion for each resource needs to be determined. By relaxing the constraints


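$$\text{(P1)} \quad \left\{\mathbf{Y}^{(n,i)}_c, \mathbf{Z}^{(n,i)}_c, \mathbf{W}^{(n,i)}_c, \nu^{(n,i)}_c\right\} = \arg\max_{\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c, \nu_c\}} \sum_{u \in \mathcal{U}^{(n)}_c} \left( \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j}\, U\left( \varphi^{t,\chi}_{c,j}(u)\, \eta^t_c\, \nu^{t,\chi}_{c,j}\, I^{t,\chi}_{u,c}\left(\mathcal{B}^t_{c,j}\right) \right) + w_{u,c}\, U\left( \varphi^m_c(u) \left(1 - \varsigma^{(n)}_{m(u)}\right) I_{u,m(u)} \right) \right), \tag{33}$$

$$\text{s.t.} \quad z_{u,c} + w_{u,c} \leq 1, \quad y^{t,\chi}_{u,c,j}, z_{u,c}, w_{u,c} \in [0, 1], \quad u \in \mathcal{U}^{(n)}_c,\ \chi \in \mathcal{X},\ t \in [T],\ j \in [|\mathcal{B}^t_c|], \tag{34}$$

$$\sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j} = \frac{z_{u,c}}{\max\left( |\{t \in \mathcal{T}_A \mid \eta^t_c > 0\}|, 1 \right)}, \quad u \in \mathcal{U}^{(n)}_c,\ t \in \mathcal{T}_A, \tag{35}$$

$$\sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j} = \frac{1 - w_{u,c} - z_{u,c}}{\max\left( |\{t \in \mathcal{T}_N \mid \eta^t_c > 0\}|, 1 \right)}, \quad u \in \mathcal{U}^{(n)}_c,\ t \in \mathcal{T}_N, \tag{36}$$

$$\sum_{\chi \in \mathcal{X}} \nu^{t,\chi}_{c,j} = 1, \quad \nu^{t,\chi}_{c,j} \in [0, 1], \quad t \in [T],\ \chi \in \mathcal{X},\ j \in [|\mathcal{B}^t_c|]. \tag{37}$$

$$\text{(P2)} \quad \eta^{(n,i)}_c = \arg\max_{\eta_c} \sum_{u \in \mathcal{U}^{(n)}_c} U\left( \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j}\, \varphi^{t,\chi}_{c,j}(u)\, \eta^t_c\, \nu^{t,\chi}_{c,j}\, I^{t,\chi}_{u,c}\left(\mathcal{B}^t_{c,j}\right) \right), \tag{38}$$

$$\text{s.t.} \quad \sum_{t \in \mathcal{T}_A} \eta^t_c = \varsigma^{(n)}_{b(c)}, \quad \sum_{t \in \mathcal{T}_N} \eta^t_c = 1 - \varsigma^{(n)}_{b(c)}, \quad \eta^t_c \in [0, 1],\ t \in [T]. \tag{39}$$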

$y^{t,\chi}_{u,c,j}, z_{u,c}, w_{u,c} \in \{0, 1\}$ in (26) with $y^{t,\chi}_{u,c,j}, z_{u,c}, w_{u,c} \in [0, 1]$ and applying a series of lower bounds utilizing Jensen's inequality and the inequality of arithmetic and geometric means, a lower bound for (23) is obtained, which has a multi-convex structure as in (15), and the original problem is similarly separated into the iterations between two subproblems. The proposed optimization problem at the $n$th iteration in the C-MME of cluster $c$ is written as in (33)-(39). Note that (P1) and (P2) are the subproblems of the original one in the $i$th iteration for given latest solutions from (P2) and (P1) in the $(i-1)$th iteration, respectively⁴. Here, (P1) has a special structure that lets it become convex in $\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c\}$ when $\nu_c$ is fixed and convex in $\nu_c$ when $\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c\}$ is fixed, and (P2) is convex in $\eta_c$. Thus, a suboptimal solution can be obtained by solving (P1) and (P2) iteratively by using a convex programming tool [45] with a rounding procedure, which is summarized in Fig. 4.

Then, each sBS in cluster $c$ partitions the resource according to $\eta_c$ synchronously and $\mathcal{A}^{t,\chi}_{c,j}$ can be determined as $\mathcal{A}^{t,\chi}_{c,j} = \{u \in \mathcal{U}^{(n)}_c \mid y^{t,\chi}_{u,c,j} = 1\}$. In each resource $t$, each BS group $j$ uses transmission scheme $\chi$ with the portion $\nu^{t,\chi}_{c,j}$ and, during that portion, users in $\mathcal{A}^{t,\chi}_{c,j}$ are scheduled according to the PFS strategy and served by the transmission scheme $\chi$.

D. C-MME operation 2: cluster uplink information update

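In order to update the cluster uplink information $\Gamma^{(n)}_c$, the C-MME first determines the parameter set $\mathcal{Q}^{(n)}_c = \{\mathcal{S}^{(n)}_{c,\chi}, k^{(n)}_{c,\chi}, m^{(n)}_{c,\chi}, \xi^{(n)}_{c,\chi,A}, \xi^{(n)}_{c,\chi,N} \mid \chi \in \mathcal{X}\}$ for the approximated user rate as

$$\left\{\mathcal{S}^{(n)}_{c,\chi}, k^{(n)}_{c,\chi}, m^{(n)}_{c,\chi}, \xi^{(n)}_{c,\chi,A}, \xi^{(n)}_{c,\chi,N}\right\} = \arg\min_{\{\mathcal{S}_{c,\chi}, k_{c,\chi}, m_{c,\chi}, \xi_{c,\chi,A}, \xi_{c,\chi,N}\}} \sum_{u \in \mathcal{U}^{(n)}_c} \left| \frac{R_u}{\xi_{c,\chi,A}\, I^{t_A,\chi}_{u,c}\left(\omega_u(\mathcal{S}_{c,\chi})\right) + \xi_{c,\chi,N}\, I^{t_N,\chi}_{u,c}\left(\omega_u(\mathcal{S}_{c,\chi})\right)} - 1 \right|^2. \tag{40}$$

⁴ Here, the original PF metric for $\mathcal{U}^{(n)}_c$ is used in (P2).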

Since finding the best solution to the above problem might not be critical for the overall optimization, a suboptimal solution can be adopted in this paper, which is summarized in Fig. 5. First, the effective number of users for the transmission scheme $\chi$ in cluster $c$, denoted as $\mu^{(n)}_{c,\chi}$, and the effective number of BSs for the multiuser diversity, denoted as $\varepsilon^{(n)}_c$, which are necessary to evaluate the shape parameter and the multiuser diversity gain for $R_{u,c}(\mathcal{Q}^{(n)}_c)$ in the EPC-MME, are assumed to be given as the largest number of the associated users among the BS groups in cluster $c$ and the mean of the effective number of BSs for the multiuser diversity gain over all users in cluster $c$, respectively, as shown in step 1. In step 2, several variables are set, in which $c^3_{c,\chi}$ and $c^4_{c,\chi}$ are determined for a random variable $A$ with the CDF of $F_A(x; |\mathcal{S}^{(n)}_{c,\chi}|)$ and a random variable $B \sim \Gamma(|\mathcal{S}^{(n)}_{c,\chi}|, |\mathcal{S}^{(n)}_{c,\chi}|^{-1})$, similarly as in IV-A, $d_{\chi,1}$ ($d_{\chi,2}$) denotes the mean value of the allocated resource portion for ABS (NS) and $d_{\chi,3}$ ($d_{\chi,4}$) denotes the mean value of the number of the associated users for each group in ABS (NS). Then, by assuming that BSs for each user are given as the rank set $\mathcal{S}^{(n)}_{c,\chi}$, the shape parameter $k^{(n)}_{c,\chi}$ and the multiuser diversity gain $m^{(n)}_{c,\chi}$ are determined as in step 3, similarly as in (14) and (12). Also, the effective scheduling portion $\xi^{(n)}_{c,\chi,A}$ ($\xi^{(n)}_{c,\chi,N}$) for ABS (NS) for each transmission scheme $\chi$ is set to the mean value of the allocated resource portion divided by the mean value of the number of the associated users for each


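Step 0 (Initialization): Set $\eta^t_c = \varsigma^{(n)}_{b(c)} / |\mathcal{T}_A|$ for $t \in \mathcal{T}_A$ and $\eta^t_c = (1 - \varsigma^{(n)}_{b(c)}) / |\mathcal{T}_N|$ for $t \in \mathcal{T}_N$. Also, set $i = 0$, $\varepsilon = 10^{-2}$ and $\Upsilon^{(n,0)}_c = -\infty$.

Step 1: Increase $i$ by 1 and solve P1 at a given $\eta^{(n,i-1)}_c$.

Step 2: Solve P2 at a given $\{\mathbf{Y}^{(n,i)}_c, \mathbf{Z}^{(n,i)}_c, \mathbf{W}^{(n,i)}_c, \nu^{(n,i)}_c\}$.

Step 3: Calculate the PF metric for cluster $c$, $\Upsilon^{(n,i)}_c = \sum_{u \in \mathcal{U}^{(n)}_c} U(R_u)$.

Step 4: Go to the next step if $|\Upsilon^{(n,i)}_c - \Upsilon^{(n,i-1)}_c| \leq \varepsilon$. Otherwise, go back to Step 1.

Step 5: Set $r^{t,\chi}_{u,c,j} = y^{t,\chi}_{u,c,j}\, \varphi^{t,\chi}_{c,j}(u)\, \eta^t_c\, \nu^{t,\chi}_{c,j}\, I^{t,\chi}_{u,c}(\mathcal{B}^t_{c,j})$ as the expected rate for user $u$ from the $j$th BS group with transmission scheme $\chi$ at resource $t$ for cluster $c$.

Step 6: Set $r^{(n)}_{u,c} = w_{u,c}\, \varphi^m_c(u) (1 - \varsigma^{(n)}_{m(u)}) I_{u,m(u)}$ as the expected rate for user $u$ from the neighboring mBS.

Step 7: Select $\chi$ and $j$ having the maximum expected rate at each resource $t \in [T]$.

Step 8: If $\sum_{t \in \mathcal{T}_A} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j} \geq \sum_{t \in \mathcal{T}_N} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j}$ and $\sum_{t \in \mathcal{T}_A} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j} \geq r^m_{u,c}$, set $y^{t,\chi}_{u,c,j} = 1$ for each selected $\chi$ and $j$ at each resource $t \in \mathcal{T}_A$; else if $\sum_{t \in \mathcal{T}_A} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j} < \sum_{t \in \mathcal{T}_N} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j}$ and $\sum_{t \in \mathcal{T}_N} \max_{j \in [|\mathcal{B}^t_c|], \chi \in \mathcal{X}} r^{t,\chi}_{u,c,j} \geq r^m_{u,c}$, set $y^{t,\chi}_{u,c,j} = 1$ for each selected $\chi$ and $j$ at each resource $t \in \mathcal{T}_N$.

Step 9: Set $\mathcal{U}^{(n)}_c = \{u \in \mathcal{U}^{(n)}_c \mid \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,c,j} > 0\}$ as the effective user set for cluster $c$.

Step 10: Refine $\{\eta^{(n)}_c, \nu^{(n)}_c\} = \arg\max_{\eta_c, \nu_c} \sum_{u \in \mathcal{U}^{(n)}_c} U\left( \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} r^{t,\chi}_{u,c,j} \right)$ with the constraints in (23) and (24), which is a convex problem for each variable by fixing the other so that a suboptimal solution can be obtained by solving it iteratively with a convex programming tool.

Fig. 4. Optimization procedure at each C-MME.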
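The rounding part of the procedure (Steps 7 and 8 of Fig. 4) can be sketched per user as below. This is a hedged illustration: the function name, the dictionary layout of the expected rates, and the convention of returning an empty selection when the macro cell is preferred are assumptions made here, and the convex subproblems (P1) and (P2) are assumed to have been solved beforehand.

```python
def round_user_association(rates, macro_rate, abs_resources):
    """Rounding of Steps 7-8 for a single user.

    rates[t] maps (j, chi) to the expected rate from BS group j with scheme chi
    at resource t; macro_rate is the expected rate from the neighboring mBS;
    abs_resources is the set of ABS resource indices. Returns {t: (j, chi)} for
    the resources where the indicator y is set to 1 (empty if none is set).
    """
    # Step 7: best (group, scheme) pair at every resource.
    best = {t: max(r, key=r.get) for t, r in enumerate(rates)}
    best_rate = {t: rates[t][choice] for t, choice in best.items()}
    abs_sum = sum(v for t, v in best_rate.items() if t in abs_resources)
    ns_sum = sum(v for t, v in best_rate.items() if t not in abs_resources)
    # Step 8: keep either all ABS resources or all NS resources.
    if abs_sum >= ns_sum and abs_sum >= macro_rate:
        return {t: c for t, c in best.items() if t in abs_resources}
    if abs_sum < ns_sum and ns_sum >= macro_rate:
        return {t: c for t, c in best.items() if t not in abs_resources}
    return {}
```

A user whose selection comes back empty has no indicator set to 1 and therefore drops out of the effective user set formed in Step 9.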

group in ABS (NS). Finally, the rank set $\mathcal{S}^{(n)}_{c,\chi}$ is selected to fit $R_u$.

Also, define the residual error at the $n$th iteration as

$$\phi^{(n)}_c = \sum_{u \in \mathcal{U}^{(n)}_c} x^{(n)}_{u,c} \left( U\left( \varphi_c(u)\, g\left(\mathbf{X}^{(n)}_c\right) R_u \right) - U\left( \varphi_c(u) \left|\mathcal{U}^{(n)}_c\right| R_{u,c}\left(\mathcal{Q}^{(n)}_c\right) \right) \right), \tag{41}$$

where $g(\mathbf{X}^{(n)}_c) = \sum_{u \in \mathcal{U}^{(n)}_c} (1 + e^{-c(x^{(n)}_{u,c} - 0.5)})^{-1}$ for some constant $c > 0$ is a smooth function of $x^{(n)}_{u,c}$ approximating the number of associated users and $\varphi_c(u)$ can be given by

$$\varphi_c(u) = R_u^{(1-\alpha)/\alpha} \Big/ \sum_{u' \in \mathcal{U}^{(n)}_c} R_{u'}^{(1-\alpha)/\alpha}\, x^{(n)}_{u',c}, \tag{42}$$

similarly as in [41]. In a typical trust region algorithm solving a very complex problem such as in [48], an approximated model (such as a linear approximation or a quadratic approximation) is used to obtain the trust region trial step and the ratio between the actual reduction (increment) in the original function and the predicted reduction (increment) in the approximated model within a trust region with an acceptable trial step needs to be greater than a given trust region threshold value (say 0.75). Note that (41) is just for the currently given $\mathcal{U}^{(n)}_c$ and $\mathbf{X}^{(n)}_c$ and the whole function for the residual error is not available at each C-MME. In order to compute a proper trust region for the EPC-MME, the trust region concept [48], which provides how to compute the trust region trial step and decide whether a trial step is acceptable or not, is applied. In this paper, it is assumed that a linear approximation model is used to obtain the trust region trial step and the original function is approximated by a quadratic approximation using the Taylor series of (41). As a result, an approximated ratio between the reductions

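Step 1: Calculate the effective number of users for the transmission scheme $\chi$, $\mu^{(n)}_{c,\chi}$, and the effective number of BSs for the multiuser diversity gain, $\varepsilon^{(n)}_c$, as

$$\mu^{(n)}_{c,\chi} = \max_{t \in [T],\, j \in [|\mathcal{B}^t_c|]} \sum_{u \in \mathcal{U}^{(n)}_c} y^{t,\chi}_{u,c,j},$$

$$\varepsilon^{(n)}_c = \left|\mathcal{U}^{(n)}_c\right|^{-1} \sum_{u \in \mathcal{U}^{(n)}_c} \left( \sum_{b \in \beta_u(\mathcal{S}^{(n)}_{c,\chi})} \gamma_{u,b} \Big/ \max_{b \in \beta_u(\mathcal{S}^{(n)}_{c,\chi})} \gamma_{u,b} \right).$$

Step 2: Set

$$c^3_{c,\chi} = E\left[A^n_{\langle 1 \rangle}\right] / E\left[A^n_{(1)}\right], \quad c^4_{c,\chi} = E\left[B^n_{\langle 1 \rangle}\right] / E\left[B^n_{(1)}\right],$$

$$d_{\chi,1} = \sum_{t \in \mathcal{T}_A,\, \eta^t_c > 0} \sum_{j \in [|\mathcal{B}^t_c|]} \nu^{t,\chi}_{c,j} \Big/ \sum_{t \in \mathcal{T}_A,\, \eta^t_c > 0} \left|\mathcal{B}^t_c\right|,$$

$$d_{\chi,2} = \sum_{t \in \mathcal{T}_N,\, \eta^t_c > 0} \sum_{j \in [|\mathcal{B}^t_c|]} \nu^{t,\chi}_{c,j} \Big/ \sum_{t \in \mathcal{T}_N,\, \eta^t_c > 0} \left|\mathcal{B}^t_c\right|,$$

$$d_{\chi,3} = \sum_{u \in \mathcal{U}^{(n)}_c} \sum_{t \in \mathcal{T}_A,\, \eta^t_c > 0} \sum_{j \in [|\mathcal{B}^t_c|]} y^{t,\chi}_{u,c,j} \Big/ \sum_{t \in \mathcal{T}_A,\, \eta^t_c > 0} \left|\mathcal{B}^t_c\right|,$$

$$d_{\chi,4} = \sum_{u \in \mathcal{U}^{(n)}_c} \sum_{t \in \mathcal{T}_N,\, \eta^t_c > 0} \sum_{j \in [|\mathcal{B}^t_c|]} y^{t,\chi}_{u,c,j} \Big/ \sum_{t \in \mathcal{T}_N,\, \eta^t_c > 0} \left|\mathcal{B}^t_c\right|.$$

Step 3: Update $k^{(n)}_{c,\chi}$, $m^{(n)}_{c,\chi}$, $\xi^{(n)}_{c,\chi,A}$, and $\xi^{(n)}_{c,\chi,N}$ by using the solutions for (P1) and (P2) as follows:

$$k^{(n)}_{c,\chi} = \left( c^3_{c,\chi} \left|\mathcal{S}^{(n)}_{c,\chi}\right| \left( 1 - \left(1 + \mu^{(n)}_{c,\chi}\right)^a \right) \right)^b, \quad a = -\left( \left|\mathcal{S}^{(n)}_{c,\chi}\right| - 1 \right)^{-1}, \quad b = \mathbb{1}\{\chi = Z\}\, \mathbb{1}\left\{\left|\mathcal{S}^{(n)}_{c,\chi}\right| > 1\right\},$$

$$m^{(n)}_{c,\chi} = F^{-1}\left( \mu^{(n)}_{c,\chi} / \left(\mu^{(n)}_{c,\chi} + 1\right); \varepsilon^{(n)}_c, \left(c^4_{c,\chi}\right)^b / \varepsilon^{(n)}_c \right),$$

$$\xi^{(n)}_{c,\chi,A} = d_{\chi,1}\, \varsigma^{(n)}_{b(c)} / d_{\chi,3},$$

$$\xi^{(n)}_{c,\chi,N} = d_{\chi,2} \left(1 - \varsigma^{(n)}_{b(c)}\right) / d_{\chi,4}.$$

Step 4: Finally, update the rank set $\mathcal{S}^{(n)}_{c,\chi}$ by using (34), which may be obtained by using an exhaustive search for small-sized clusters and by using a simple greedy algorithm, otherwise.

Fig. 5. Procedure for determining $\mathcal{Q}^{(n)}_c$.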

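$$\sigma^{(n)}_{u,c} = \frac{\partial \phi^{(n)}_c}{\partial x^{(n)}_{u,c}} = U\!\left(\varphi_c(u)\, g\!\left(\mathbf{X}^{(n)}_c\right) R_u\right) - U\!\left(\varphi_c(u) \left|\mathcal{U}^{(n)}_c\right| R_{u,c}\!\left(Q^{(n)}_c\right)\right) + \sum_{u' \in \mathcal{U}^{(n)}_c} x^{(n)}_{u',c} \left(p^{(n)}_{u',c}(u) - q^{(n)}_{u',c}(u)\right),$$

$$\delta^{(n)}_c = \frac{\partial \phi^{(n)}_c}{\partial \varsigma^{(n)}_{b(c)}} = \sum_{u \in \mathcal{U}^{(n)}_c} x^{(n)}_{u,c} \left(s^{(n)}_{u,c} \Big(\sum_{t \in [T]} \xi^{t}_{u,c}\Big) - t^{(n)}_{u,c} \left(R_{u,c}\!\left(Q^{(n)}_c\right)\Big|_{\varsigma^{(n)}_{b(c)} = 1} - R_{u,c}\!\left(Q^{(n)}_c\right)\Big|_{\varsigma^{(n)}_{b(c)} = 0}\right)\right),$$

$$\tilde{\sigma}^{(n)}_{u,c} = \frac{\partial^2 \phi^{(n)}_c}{\partial [x^{(n)}_{u,c}]^2} = 2\left(p^{(n)}_{u,c}(u) - q^{(n)}_{u,c}(u)\right) + \sum_{u' \in \mathcal{U}^{(n)}_c} x^{(n)}_{u',c} \left(\tilde{p}^{(n)}_{u',c}(u) - \tilde{q}^{(n)}_{u',c}(u)\right),$$

$$\tilde{\delta}^{(n)}_c = \frac{\partial^2 \phi^{(n)}_c}{\partial [\varsigma^{(n)}_{b(c)}]^2} = \sum_{u \in \mathcal{U}^{(n)}_c} x^{(n)}_{u,c} \left(\tilde{s}^{(n)}_{u,c} \Big(\sum_{t \in [T]} \xi^{t}_{u,c}\Big)^{2} - \tilde{t}^{(n)}_{u,c} \left(R_{u,c}\!\left(Q^{(n)}_c\right)\Big|_{\varsigma^{(n)}_{b(c)} = 1} - R_{u,c}\!\left(Q^{(n)}_c\right)\Big|_{\varsigma^{(n)}_{b(c)} = 0}\right)^{2}\right),$$

$$\alpha^{(n)}_c = \min\left\{ \frac{2 \left|\sigma^{(n)}_{u,c}\right| e^{(1)}_{th}}{\left|\tilde{\sigma}^{(n)}_{u,c}\right|} \,\middle|\, u \in \mathcal{U}^{(n)}_c \right\}, \qquad \beta^{(n)}_c = \frac{2 \left|\delta^{(n)}_c\right| e^{(2)}_{th}}{\left|\tilde{\delta}^{(n)}_c\right|},$$

where a tilde marks the second-derivative counterpart of each first-derivative quantity and

$$p^{(n)}_{u',c}(u) = \frac{\partial U\!\left(\varphi_c(u')\, g(\mathbf{X}^{(n)}_c)\, R_{u'}\right)}{\partial x^{(n)}_{u,c}}, \qquad \tilde{p}^{(n)}_{u',c}(u) = \frac{\partial^2 U\!\left(\varphi_c(u')\, g(\mathbf{X}^{(n)}_c)\, R_{u'}\right)}{\partial [x^{(n)}_{u,c}]^2},$$

$$q^{(n)}_{u',c}(u) = \frac{\partial U\!\left(\varphi_c(u') \left|\mathcal{U}^{(n)}_c\right| R_{u',c}(Q^{(n)}_c)\right)}{\partial x^{(n)}_{u,c}}, \qquad \tilde{q}^{(n)}_{u',c}(u) = \frac{\partial^2 U\!\left(\varphi_c(u') \left|\mathcal{U}^{(n)}_c\right| R_{u',c}(Q^{(n)}_c)\right)}{\partial [x^{(n)}_{u,c}]^2},$$

$$s^{(n)}_{u,c} = \frac{\partial U\!\left(\varphi_c(u)\, g(\mathbf{X}^{(n)}_c)\, R_u\right)}{\partial R_u}, \qquad \tilde{s}^{(n)}_{u,c} = \frac{\partial^2 U\!\left(\varphi_c(u)\, g(\mathbf{X}^{(n)}_c)\, R_u\right)}{\partial R_u^2},$$

$$t^{(n)}_{u,c} = \frac{\partial U\!\left(\varphi_c(u) \left|\mathcal{U}^{(n)}_c\right| R_{u,c}(Q^{(n)}_c)\right)}{\partial R_{u,c}(Q^{(n)}_c)}, \qquad \tilde{t}^{(n)}_{u,c} = \frac{\partial^2 U\!\left(\varphi_c(u) \left|\mathcal{U}^{(n)}_c\right| R_{u,c}(Q^{(n)}_c)\right)}{\partial R^2_{u,c}(Q^{(n)}_c)},$$

$$\xi^{t}_{u,c} = \frac{\partial \eta^t_c}{\partial \varsigma^{(n)}_{b(c)}} \frac{\partial R_u}{\partial \eta^t_c} = \frac{(-1)^{1_{\{t \in \mathcal{T}_N\}}}\, \eta^t_c}{1_{\{t \in \mathcal{T}_A\}} \sum_{t \in \mathcal{T}_A} \eta^t_c + 1_{\{t \in \mathcal{T}_N\}} \sum_{t \in \mathcal{T}_N} \eta^t_c} \sum_{j \in [|\mathcal{B}^t_c|]} \sum_{\chi \in \mathcal{X}} \nu^{t,\chi}_{c,j}\, \varphi^{t,\chi}_{c,j}(u)\, I^{t,\chi}_{u,c}\!\left(\mathcal{B}^t_{c,j}\right).$$

Fig. 6. Equations for determining $\nabla^{(n)}_c$ and $\Delta^{(n)}_c$.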

(increments) of the quadratic and linear approximations using the Taylor series of (41) is instead used. For pre-determined trust region thresholds $e^{(1)}_{th}, e^{(2)}_{th} > 0$, the approximated ratio is obtained from $\sigma^{(n)}_{u,c}$ and $\tilde{\sigma}^{(n)}_{u,c}$ ($\delta^{(n)}_c$ and $\tilde{\delta}^{(n)}_c$), which are the first and second derivatives of $\psi^{(n)}_c$ in (41) with respect to $x^{(n)}_{u,c}$ ($\varsigma^{(n)}_{b(c)}$), respectively, where the tilde marks the second derivative, and $\nabla^{(n)}_c = [\{\sigma^{(n)}_{u,c} \,|\, u \in \mathcal{U}^{(n)}_c\}, \delta^{(n)}_c]$ and $\Delta^{(n)}_c = [\alpha^{(n)}_c, \beta^{(n)}_c]$ are updated as summarized in Fig. 6.
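The expressions for the trust region sizes in Fig. 6 can be illustrated numerically. The following is a minimal Python sketch, assuming that the first and second derivatives have already been evaluated and are passed in as plain numbers. The function name `trust_region_sizes` and its argument names are hypothetical and do not appear in the paper. The sketch follows the reading that each size is twice the magnitude of the first derivative times the threshold, divided by the magnitude of the second derivative, and that the user-wise sizes are combined by a minimum.

```python
import math
from typing import Sequence, Tuple


def trust_region_sizes(
    first_x: Sequence[float],
    second_x: Sequence[float],
    first_abs: float,
    second_abs: float,
    e_th1: float,
    e_th2: float,
) -> Tuple[float, float]:
    """Return (alpha, beta) from first and second derivatives.

    first_x / second_x : per-user derivatives with respect to x_{u,c}
    first_abs / second_abs : derivatives with respect to the ABS ratio
    e_th1 / e_th2 : trust region thresholds
    """

    def radius(first: float, second: float, e_th: float) -> float:
        # Assumption of this sketch: a zero second derivative means the
        # linear and quadratic models coincide, so the size is unbounded.
        if second == 0.0:
            return math.inf
        return 2.0 * abs(first) * e_th / abs(second)

    # alpha is the smallest per-user size, so no user leaves its region.
    alpha = min(
        (radius(f, s, e_th1) for f, s in zip(first_x, second_x)),
        default=math.inf,
    )
    beta = radius(first_abs, second_abs, e_th2)
    return alpha, beta
```

For instance, with per-user first derivatives (1, 2), second derivatives (4, 2), and a threshold of 3, the per-user sizes are 1.5 and 6, so the common size is the smaller one.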


E. Complexity analysis and implementation feasibility for the proposed hybrid SON scheme

On the one hand, consider the asymptotic time complexity required to solve the optimization problems in (15) for an EPC-MME and in (33)-(39) for a C-MME according to the number of users. The optimization problem in (15) for the EPC-MME is not convex but it has a special structure that it becomes convex in $\{x_{u,d} \,|\, u \in \mathcal{U}, d \in \mathcal{B}_m \cup \mathcal{C}\}$ for a given $\{\varsigma_d \,|\, d \in \mathcal{B}_m\}$ and vice versa. Thus, $\{x_{u,d} \,|\, u \in \mathcal{U}, d \in \mathcal{B}_m \cup \mathcal{C}\}$ and $\{\varsigma_d \,|\, d \in \mathcal{B}_m\}$ can be found by fixing each other iteratively. To solve each convex subproblem for $\{x_{u,d} \,|\, u \in \mathcal{U}, d \in \mathcal{B}_m \cup \mathcal{C}\}$ or $\{\varsigma_d \,|\, d \in \mathcal{B}_m\}$, the interior point method [49] utilizing an iterative Newton step can be used, in which for a convex problem with size of $n$, a matrix inversion with complexity of $O(n^{2.3})$ floating-point operations is required to compute a new Newton step for each iteration [50] and $O(\sqrt{n} \log(n))$ iterations are required [49]. Since a finite number of iterations are required between the two subproblems, the time complexity required for the optimization problem in the EPC-MME is given as $O(|\mathcal{U}|^{2.8} \log |\mathcal{U}|)$. Similarly, the optimization problem (P1) in (33)-(37) for the C-MME is not convex but it has a special structure that it becomes convex in $\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c\}$ or $\nu_c$ by fixing the other. Thus, each subproblem for $\{\mathbf{Y}_c, \mathbf{Z}_c, \mathbf{W}_c\}$ or $\nu_c$ can be solved similarly as in the EPC-MME case. In addition, (P2) in (38) and (39) is convex and a finite number of iterations are performed between (P1) and (P2), which leads to the time complexity of $O(|\mathcal{U}^{(n)}_c|^{2.8} \log |\mathcal{U}^{(n)}_c|)$.
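As a worked check of the stated order, assume that the problem size $n$ seen by the interior point method grows linearly with the number of users (the numbers of mBSs and clusters being treated as fixed, which is an assumption made here and not a statement from the paper). The per-iteration cost and the iteration count then multiply as follows:

```latex
O\!\left(n^{2.3}\right) \cdot O\!\left(\sqrt{n}\,\log n\right)
  = O\!\left(n^{2.3 + 0.5}\,\log n\right)
  = O\!\left(n^{2.8}\,\log n\right),
\qquad n \propto |\mathcal{U}| .
```

The finite number of alternations between the two subproblems contributes only a constant factor, so it does not change the order.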

On the other hand, consider the implementation feasibility of the proposed scheme. In a typical LTE system, there are hundreds to thousands of macro cells connected to an EPC-MME [51] and there are about 200 simultaneously radio resource control (RRC)-connected users per macro cell [52]. Also, a typical hotspot area can be characterized by its user density, about 10 times higher than that in a normal macro-cell area, and its area of about $10^4\,\mathrm{m}^2$ [32]. If we assume one mBS per $1\,\mathrm{km}^2$ and several sBSs in each hotspot area, the number of simultaneously RRC-connected users handled by each C-MME is about a few tens. Suppose that the proposed scheme is used for the determination and update of the target eNode B for the user plane of each RRC-connected user. Then, the allowed control plane latency in the EPC-MME for an RRC-connection request is about 15 ms [53]. Under this latency, it seems impractical to handle up to hundreds of thousands of users in an EPC-MME. However, only users in a small portion of macro cells neighboring hotspots need to participate in the proposed scheme. Also, among such users, users with a dominant reference signal received power (RSRP) from a cell can be pre-determined. Finally, the EPC-MME can divide the whole problem into geographically divided independent subproblems comprised of neighboring mBSs and hotspot clusters. For an example of a typical urban city [54], 10% of measurement points are classified as hotspot points so that the number of macro cells overlaid with hotspots in an EPC-MME is at most a few hundred. If these macro cells are divided into several tens of disjoint groups, about a thousand RRC-connected users need to be jointly considered in the proposed UA scheme for each group in the EPC-MME. Since only a part of the users are located in the edge area, users with a dominant RSRP from a cell can be automatically associated and the effective number of users for the proposed scheme can be much smaller. Note that a general convex problem can be solved iteratively by approximating the original problem by a quadratic program (QP) and a code generator such as CVXGEN [55] can generate fast custom code for QP-representable convex optimization. Thus, by assuming a dedicated processor for the computation in each EPC-MME or C-MME, the proposed scheme can be implemented in a practical cellular system, which will be confirmed by a numerical example in Section V.

V. SIMULATION RESULTS

In order to evaluate the advantages of the proposed joint UA scheme with JP-CoMP using hybrid SON, denoted as “H-SON”, a two-tier network is considered, where $\Phi_c$ for each cluster $\xi(c)$ is assumed to follow a Matérn hard-core process of type 3 with density of $\lambda_c = 8 \times 10^{-6}$ generated from a homogeneous Poisson point process (HPPP) and $\xi(c)$ is assumed to be a circle centered at $c$ with the radius of $R_c = 65$ (m) to set about 10% coverage of the total area as hotspot areas to reflect a typical hotspot area reported in [32][56]. Also, for user locations, it is assumed that $\Phi_u$ is an HPPP with density of $\lambda_u = 5 \times 10^{-4}$ (users/m$^2$) and $\Phi_{uc}$ is an HPPP over $\xi(c)$ with density of $\lambda_h = 10\lambda_u$ to reflect a typical user density ratio reported in [32].

In order to serve users in the above cHCN environment, it is assumed that a service provider deploys macro-cell BSs equipped with $N^A_d \in \{1, 2, 4\}$, $d \in \mathcal{B}_m$, antennas with density of $\lambda_m = 1 \times 10^{-6}$ and single-antenna sBSs, 10 on average, for each small-cell cluster, i.e., $\pi R_c^2 \lambda_s(c) = 10$. The minimum distance between a macro-cell BS and a cluster center and that between neighboring cluster centers are set to $1.5R_c$ and $3R_c$, respectively, for a practical cHCN scenario, which follows the 3GPP recommendations and its evaluation methodology parameters for the small-cell cluster deployment scenario [57]. In order to validate the performance of the proposed scheme, the clustering phase is first performed. Here, based on the reported user measurement information $\{\gamma_u\}$, the EPC-MME can transform $\gamma_u$ into an approximated relative distance between the user and the corresponding sBS as follows:

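$$d_{u,b} = \left(\frac{N_0}{P_s}\,\gamma_{u,b}\right)^{-1/\theta}, \qquad (44)$$

where $P_s$, $N_0$, and $\theta$ denote the transmit power of sBSs, the noise power, and the pathloss exponent, respectively. Then, using the above distance values between users and sBSs, the distance between sBSs might be approximated as follows:

$$d_{b,b'} = \frac{1}{\left|\mathcal{U}_{b,b'} \cup \mathcal{U}_{b',b}\right|} \left(\sum_{u \in \mathcal{U}_{b,b'}} d_{u,b'} + \sum_{u \in \mathcal{U}_{b',b}} d_{u,b}\right), \qquad (45)$$

where $\mathcal{U}_{b,b'} = \left\{u \in \mathcal{U} \,\middle|\, b = \arg\max_{b'' \in \mathcal{B}_u} \gamma_{u,b''},\ b' \in \mathcal{B}_u\right\}$.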
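The two distance estimates in (44) and (45) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $\gamma_{u,b}$ is taken to be a linear-scale SNR, the measurement reports are held in a hypothetical nested dictionary `snr[user][sbs]`, and the function names are chosen here and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set, Tuple

# Hypothetical container: user -> {sBS -> measured SNR in linear scale}.
Snr = Dict[Hashable, Dict[Hashable, float]]


def user_bs_distance(gamma: float, p_s: float, n0: float, theta: float) -> float:
    """Relative user-to-sBS distance implied by one SNR report, as in (44)."""
    return (n0 * gamma / p_s) ** (-1.0 / theta)


def bs_bs_distances(
    snr: Snr, p_s: float, n0: float, theta: float
) -> Dict[Tuple[Hashable, Hashable], float]:
    """Approximate sBS-to-sBS distances by averaging, as in (45)."""
    # members[(b, b2)] plays the role of U_{b,b2}: users whose strongest
    # sBS is b and who also report a measurement for b2.
    members: Dict[Tuple[Hashable, Hashable], Set[Hashable]] = defaultdict(set)
    for user, reports in snr.items():
        if not reports:
            continue
        best = max(reports, key=reports.get)
        for other in reports:
            if other != best:
                members[(best, other)].add(user)

    distances: Dict[Tuple[Hashable, Hashable], float] = {}
    for b, b2 in list(members):
        if (b, b2) in distances:
            continue
        fwd = members.get((b, b2), set())
        rev = members.get((b2, b), set())
        # Users served by b contribute their distance to b2 and vice versa.
        total = sum(user_bs_distance(snr[u][b2], p_s, n0, theta) for u in fwd)
        total += sum(user_bs_distance(snr[u][b], p_s, n0, theta) for u in rev)
        d = total / len(fwd | rev)
        distances[(b, b2)] = distances[(b2, b)] = d
    return distances
```

Pairs of sBSs that no user reports together simply get no estimate in this sketch; how such pairs are handled in the paper is not specified in this section.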

Then, by utilizing the total least squares algorithm [34] among


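$$\left\{\{\mathbf{X}^*_d, \varsigma^*_d \,|\, d \in \mathcal{B}_m\}, \{\mathbf{X}^*_d, \mathbf{Y}^*_d, \mathbf{Z}^*_d, \eta^*_d, \nu^*_d \,|\, d \in \mathcal{C}\}\right\} = \arg\max_{\{\{\mathbf{X}_d, \varsigma_d | d \in \mathcal{B}_m\}, \{\mathbf{X}_d, \mathbf{Y}_d, \mathbf{Z}_d, \eta_d, \nu_d | d \in \mathcal{C}\}\}} \sum_{u \in \mathcal{U}} U\!\left(\sum_{d \in \mathcal{B}_m} x_{u,d}\, \varphi_d(u) \left(1 - \varsigma_d\right) I_{u,d} + \sum_{d \in \mathcal{C}} \sum_{t \in [T]} \sum_{j \in [|\mathcal{B}^t_d|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,d,j}\, \varphi^{t,\chi}_{d,j}(u)\, \eta^t_d\, \nu^{t,\chi}_{d,j}\, I^{t,\chi}_{u,d}\!\left(\mathcal{B}^t_{d,j}\right)\right). \qquad (43)$$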

randomly selected sBSs repeatedly, the relative location of each sBS can be estimated and the k-Means algorithm [35] follows to cluster the sBSs, in which the effective number of clusters is determined by using the elbow method [36]. Here, it is assumed that the role of the C-MME for each assigned cluster is supported by its nearby mBS and the sBSs in the associated clusters are connected to the corresponding C-MME. The performance results of the proposed scheme in Figs. 7-12 are obtained with the automatically clustered sBSs. The transmit powers of the mBSs and sBSs are set to 46 dBm and 30 dBm, respectively, the noise power spectral density is set to -174 dBm/Hz, and the system bandwidth is assumed to be 10 MHz. By reflecting the fact that a macro-cell has a relatively larger coverage than a small-cell, the required average SINR level of the system is set to −9 dB for macro-cell users and −6 dB for small-cell cluster users by considering QPSK with code rate 193/1024, which corresponds to the LTE channel quality information (CQI) 3 with a maximum of 3 re-transmissions for macro-cell users and 1 re-transmission for small-cell cluster users [58][59]. Also, it is assumed that each frame consists of $N_S = 10$ subframes and each subframe of 1 ms interval consists of $N_{RB} = 100$ RBs as in the LTE, and the channel for each RB between each antenna of each BS and each user is assumed to be an independent flat Rayleigh fading channel with $128.1 + 37.6\log_{10}(R)$ dB for the macro-cell pathloss model and $140.7 + 36.7\log_{10}(R)$ dB for the small-cell pathloss model, in which $R$ denotes the distance between a BS and a user in km.
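The link-budget parameters above can be combined into a mean SNR for a given distance. The following Python sketch is a simplified illustration, not the simulator used in the paper: it ignores Rayleigh fading, interference, and antenna gains, and it integrates the noise density over the full 10 MHz rather than over a single RB. The function names are hypothetical.

```python
import math


def noise_power_dbm(bandwidth_hz: float, psd_dbm_per_hz: float = -174.0) -> float:
    """Thermal noise power over the given bandwidth, in dBm."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)


def pathloss_db(distance_km: float, tier: str) -> float:
    """Distance-dependent pathloss for the two tiers, with R in km."""
    if tier == "macro":
        return 128.1 + 37.6 * math.log10(distance_km)
    if tier == "small":
        return 140.7 + 36.7 * math.log10(distance_km)
    raise ValueError("tier must be 'macro' or 'small'")


def mean_snr_db(
    tx_power_dbm: float,
    distance_km: float,
    tier: str,
    bandwidth_hz: float = 10e6,
) -> float:
    """Mean SNR in dB without fading or interference (assumption of this sketch)."""
    received_dbm = tx_power_dbm - pathloss_db(distance_km, tier)
    return received_dbm - noise_power_dbm(bandwidth_hz)
```

With these settings the noise power over 10 MHz evaluates to −104 dBm, so a 46 dBm mBS at 1 km gives a mean SNR of about 21.9 dB before interference is taken into account.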

For comparison, a centralized joint UA scheme employing the same semi-dynamic approach for small-cell clusters, denoted as “C-SON”, is considered to provide an upper-bound on the performance of the proposed H-SON scheme for each given realization. Note that all information is assumed to be available at the EPC-MME so that the expected user rate can be directly calculated at the EPC-MME. However, it is hard to implement in practical scenarios due to its high signalling overhead and computational load at the EPC-MME. Such an optimization problem for C-SON can be written as in (43) with the following constraints:

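$$\sum_{d \in \mathcal{B}_m \cup \mathcal{C}} x_{u,d} = 1, \quad x_{u,d} \in [0, 1], \quad u \in \mathcal{U}, \qquad (46)$$

$$\sum_{j \in [|\mathcal{B}^t_d|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,d,j} = \frac{z_{u,d}}{\left|\{t \in \mathcal{T}_A \,|\, \eta^t_d > 0\}\right|}, \quad t \in \mathcal{T}_A,\ u \in \mathcal{U},\ d \in \mathcal{C}, \qquad (47)$$

$$\sum_{j \in [|\mathcal{B}^t_d|]} \sum_{\chi \in \mathcal{X}} y^{t,\chi}_{u,d,j} = \frac{x_{u,d} - z_{u,d}}{\left|\{t \in \mathcal{T}_N \,|\, \eta^t_d > 0\}\right|}, \quad t \in \mathcal{T}_N,\ u \in \mathcal{U},\ d \in \mathcal{C}, \qquad (48)$$

$$\sum_{t \in \mathcal{T}_A} \eta^t_d = \varsigma^{(n)}_{b(d)}, \quad \sum_{t \in \mathcal{T}_N} \eta^t_d = 1 - \varsigma^{(n)}_{b(d)}, \quad t \in [T],\ \forall d \in \mathcal{C}, \qquad (49)$$

$$\sum_{\chi \in \mathcal{X}} \nu^{t,\chi}_{d,j} = 1, \quad \nu^{t,\chi}_{d,j} \in [0, 1], \quad \chi \in \mathcal{X},\ t \in [T],\ j \in \left[\left|\mathcal{B}^t_d\right|\right],\ d \in \mathcal{C}. \qquad (50)$$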

Here, a suboptimal solution is obtained by transforming the above problem similarly as in (23) and solving the transformed problem with a convex programming tool [45] iteratively, and the same rounding procedure used in Section IV-B is applied for each cluster. Note that the C-SON can be considered as a relaxed version of the original combinatorial problem so that it provides an upper-bound on the performance of the proposed H-SON scheme for each given realization since the same relaxation and rounding method are applied except that a suboptimal distributed method is used for the proposed H-SON. In addition to the joint UA scheme, three conventional single-cell based association schemes, the centralized joint UA without considering CoMP [19], the CRE and e-ICIC scheme [5], and the maximum SNR scheme [27][28], are compared, which are denoted as “SC-SON”, “SD-SON”, and “S-MAX”, respectively. Also, “SD-SON” is optimized by selecting the common bias and ABS ratio values found by an exhaustive search to maximize the PF metric. Note that, although UA is differently performed in each of the above five schemes, the same MU-MIMO or CoMP schemes are utilized with the same scheduling policy in every scheme in order to compare the performance according to the load balancing strategy.

In Fig. 7, the convergence performance of the proposed scheme for a typical cHCN realization is shown. In order to give the performance bound of the proposed scheme, the performance of C-SON is plotted by using a dotted line with no mark. From the results, it is shown that the performance of the proposed scheme approaches that of the “C-SON”, i.e., the upper bound, within only several exchanges of $\Gamma_c$ and $\Psi_c$ between EPC-MME and C-MMEs, which implies that the proposed iterative algorithm works very well⁵. It is also shown that the convergence performance is affected by the selection of the cluster uplink information so that it needs to be selected carefully. Here, “H-SON, bad S”, “H-SON, 0.3eth”, “H-SON, different hat R, 10eth”, and “H-SON, user pre-classification” denote the H-SON scheme with a bad selection of the rank set $\mathcal{S}_{c,\chi}$, that with $e^{(1)}_{th} = 0.9$ and $e^{(2)}_{th} = 0.06$, that using $R_{u,c}(Q_c) = \max_{\chi \in \mathcal{X}} \left(0.01\, \xi_{c,\chi,A}\, I^{t_A,\chi}_{u,c}(\beta_u(\mathcal{S}_{c,\chi})) + \xi_{c,\chi,N}\, I^{t_N,\chi}_{u,c}(\beta_u(\mathcal{S}_{c,\chi}))\right)$ instead of (16) with $e^{(1)}_{th} = 30$ and $e^{(2)}_{th} = 2$, and that with

⁵ In case of an update where only a part of users move or are replaced, only a few exchanges of $\Gamma_c$ and $\Psi_c$ between EPC-MME and C-MMEs are required for each update of the joint association.


[Figure 7: convergence curves. Bottom x-axis: Number of Iteration (1 to 12); left y-axis: Harmonic Mean [bps/Hz] (0.145 to 0.151). A second set of tick ranges (0 to 45 and 0.08 to 0.15) belongs to the top and right axes. Curves: C-SON; H-SON, eth = (3, 0.2); H-SON, different hat R, 10eth; H-SON, 0.3eth; H-SON, bad S; H-SON, user pre-classification; H-SON, discrete event.]

Fig. 7. The convergence performance of the proposed H-SON.

pre-classified effective users, respectively. Comparing them with the proposed H-SON using the cluster uplink information in Fig. 5 shows that the convergence speed is significantly degraded if the rank set is not properly selected. This comes from the fact that the residual error becomes larger and deviates from the linear approximation so that not only are many more iterations required but also the point of convergence itself deviates from the optimal point. In addition, the trust region thresholds need to be properly selected for a good convergence performance. In this simulation, the trust region threshold values of $e^{(1)}_{th} = 3$ and $e^{(2)}_{th} = 0.2$ are selected by trial and error because the performance is not very sensitive to the values if the proposed $R_{u,c}(Q_c)$ is used. However, if we use $e^{(1)}_{th} = 0.9$ and $e^{(2)}_{th} = 0.06$ instead, i.e., use a too small trust region, the convergence speed may become significantly degraded even if a properly approximated $R_{u,c}(Q_c)$ is used. On the other hand, we may want to increase the convergence speed by enlarging the trust region. However, as shown in the curve denoted as “H-SON, different hat R, 10eth”, if a mismatch is caused by selecting a poorly approximated $R_{u,c}$, the convergence speed may be increased but the point of convergence can deviate from the optimal point. Also, users with a dominant RSRP can be pre-determined as discussed earlier to reduce the complexity. As shown in Fig. 7, similar performance can be achieved when users with a dominant RSRP (6 dB threshold and 7 dB bias for sBSs) are pre-determined. In this simulation, about 80% of users were pre-determined and only 20% of users participated in the proposed scheme. Lastly, in order to show the effect of user mobility and asynchronous measurement reports, a discrete event simulation is performed, in which it is assumed that previous user locations are obtained from the current user locations by a random displacement following a 2-dimensional Gaussian distribution with the mean of 0 and the standard deviation of 20 (m) and that the measurement report of each user follows a Poisson random process with the mean arrival rate of 0.1 per second. It is

[Figure 8: association maps around a small-cell cluster; horizontal axis from −750 to −450, vertical axis from −650 to −350. Legend: sBS, user, UA. Panels: (a) S-MAX, annotated “Users associated to a macro-cell”; (b) SD-SON, annotated “Users associated to a macro-cell”, “Range expansion in outer cells only”, and “Higher load, poor link quality”; (c) SC-SON, annotated “Users associated to a macro-cell” and “Limited intra-tier offloading”; (d) the proposed H-SON, with legend sBS, user (ZFBF), user (MRT), BS group, UA, annotated “Flexible intra-tier offloading” and “Aggressive inter-tier offloading with MRT”.]

Fig. 8. Association results of S-MAX, SD-SON, SC-SON, and the proposed H-SON for a typical cHCN realization.

also assumed that each iteration takes 4 ms. As shown from the curve denoted as “H-SON, discrete event” with axes at the top and the right side in Fig. 7, the performance gets better as more measurement information is collected and the proposed scheme works well in the case of user mobility and asynchronous measurement reports.
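The two random ingredients of the discrete event simulation, the Gaussian displacement and the Poisson report process, can be sketched with the Python standard library. This is an illustration only: the function names are hypothetical, and the standard deviation of 20 m is assumed here to apply independently to each coordinate, which the text does not state explicitly.

```python
import random
from typing import List, Tuple


def previous_location(
    current_xy: Tuple[float, float], sigma_m: float = 20.0, rng=random
) -> Tuple[float, float]:
    """Draw a previous location by a zero-mean Gaussian displacement.

    Assumption of this sketch: independent displacements per coordinate,
    each with standard deviation sigma_m.
    """
    x, y = current_xy
    return (x + rng.gauss(0.0, sigma_m), y + rng.gauss(0.0, sigma_m))


def report_times(rate_per_s: float, horizon_s: float, rng=random) -> List[float]:
    """Arrival times of one user's measurement reports within the horizon.

    A Poisson process is generated from exponential inter-arrival times
    with mean 1 / rate_per_s.
    """
    t = 0.0
    times: List[float] = []
    while True:
        t += rng.expovariate(rate_per_s)
        if t > horizon_s:
            return times
        times.append(t)
```

With a mean arrival rate of 0.1 per second a user reports about once every 10 s on average, so at 4 ms per iteration many iterations run between two reports of the same user.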

In Figs. 8 and 9, the association results for the users near a small-cell cluster in a typical cHCN realization are shown with the corresponding CDFs of the system-wise SINR and the user-wise average rate. By comparing the user association results in S-MAX (Fig. 8-(a)) with SD-SON (Fig. 8-(b)), SC-SON (Fig. 8-(c)), and the proposed H-SON (Fig. 8-(d)), it is clearly shown that i) the load balancing capability of SD-SON is poor in a cHCN so that inner sBSs are hardly expanded and outer sBSs suffer from higher load and poor link quality, ii) although the load balancing capability can be improved by adopting a joint approach (SC-SON) so that some portion of the load of outer sBSs is offloaded to inner sBSs, the impact is quite limited because CoMP is not considered at the association stage, and iii) the proposed H-SON allows not only more aggressive inter-tier offloading but also more flexible intra-tier offloading for better load balancing as expected so that the load balancing capability can be greatly improved. To confirm that the link quality is managed while aggressive inter-tier and intra-tier offloadings are allowed in the proposed H-SON, the actual system-wise SINR CDF and that expected at the association stage when employing the proposed H-SON are compared to those employing S-MAX, SD-SON and SC-SON, respectively, in Fig. 9-(a). From the results, it is shown that the actual link quality of S-MAX, SD-SON or SC-SON is quite different from that expected at the association stage because the conventional schemes do not take the effect of using CoMP in small-cell clusters into account so that the load balancing capability becomes quite limited. However, the proposed H-SON can manage the link quality quite accurately


[Figure 9: two CDF plots. (a) Actual and expected link quality distribution: x-axis SINR [dB] (−6 to 10), y-axis CDF (0 to 0.6); curves for H-SON, SD-SON, SC-SON, and S-MAX, each as “expected” and “actual”. (b) Average user rate distribution: x-axis Average rate [bps/Hz] (0 to 0.2), y-axis CDF (0 to 0.6); curves for H-SON, SD-SON, SC-SON, and S-MAX.]

Fig. 9. Load balancing capability evaluation and comparison.

while much more aggressive offloading is allowed so that the load balancing capability can be greatly improved. The load balancing capability of each scheme is evaluated in terms of the average user rate distribution when the macro-cell BS employs 4 antennas and the maximum BS group size and the number of resources for the small-cell layer CoMP are 4 and 2, respectively, and is shown in Fig. 9-(b), which confirms the superiority of the proposed H-SON over conventional schemes.

In Fig. 10, the average performance gain of the proposed H-SON over SD-SON, in terms of the ratio of the bottom 5% average user rate of the proposed “H-SON” over that of the SD-SON, is evaluated for various configurations of the maximum BS group size and the number of different resources. From the results, it is shown that although larger gain is achieved as the BS group size and/or the number of resources increase, the growth rate reduces quickly. This result shows that the proposed H-SON works well in a wide range of cluster configurations and that BS groups with a few to several BSs and more than one resource for each NS and ABS are enough. Thus, it is confirmed that the proposed H-SON is quite suitable for practical scenarios.

In Fig. 11, in order to evaluate the average performance gain of the proposed H-SON over SD-SON in practical cHCN scenarios, the ratios of the bottom 5%, 10%, and 15% average

[Figure 10: two plots of Average performance gain (%) on the y-axis (100 to 140). (a) # of maximum group size: x-axis Maximum Group Size (3 to 6). (b) # of resources: x-axis T (2 to 10).]

Fig. 10. The average performance gain of the proposed H-SON over SD-SON for various maximum group sizes and number of resources.

[Figure 11: bar chart of Average performance gain (%) on the y-axis (100 to 170) for scenarios S1 to S4, grouped by 5% user rate, 10% user rate, and 15% user rate. Legend: S1: baseline cHCN; S2: 50% more sBSs; S3: 50% more clusters; S4: more random configuration.]

Fig. 11. The average performance gain of the proposed H-SON over SD-SON in practical cHCN scenarios.

user rates are evaluated and compared from 100 realizations on a 5000 m × 5000 m area. Here, S1 (S4), S2, and S3 denote a basic deployment scenario as stated in the beginning of this section, the other scenario with 50% more sBSs per cluster, and another scenario with 50% more clusters, respectively, and each small-cell cluster can be differently configured so that the maximum BS group size for each cluster is randomly picked among {2, 3, 4} for scenarios S1, S2, S3 and among {1, 5} for scenario S4. Compared with the average performance gain of the proposed scheme in the baseline scenario S1, that in S2, S3, or S4 becomes higher, i.e., as there is more room for load balancing (more clusters or more sBSs per cluster) or more randomness in configuring small-cell clusters, the superiority of the proposed scheme increases, which implies that the proposed scheme is expected to work well if applied to a practical cellular network, such as the LTE-A.

In order to extend the proposed scheme to a general α-fairness case ($\alpha \neq 1$), we may use $U(r) = r^{1-\alpha}/(1-\alpha)$ instead of $U(r) = \log(r)$ in (15), (33), or (38) and those in Figs. 4 and 6 and update the user scheduling probability variables with the other optimization variables alternately by relaxing them as independent variables. In Fig. 12, the bottom 5%, 10%, and 15% average performance gains of the proposed H-SON scheme over SD-SON are evaluated and plotted for α = 0.5, 1, 1.5, and 2 by using the above extension when the maximum group size is set to 4 and T = 2, which confirms that although the gain may vary according to a specific scheduling strategy, the proposed scheme can be well extended to the case of using a general α-fairness scheduling.
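The utility substitution described above can be written as a single function. The following Python sketch is only an illustration of the α-fair utility family; the function name is hypothetical, and the case α = 1 is handled separately as the logarithmic (proportional fairness) utility, as in the text.

```python
import math


def alpha_fair_utility(rate: float, alpha: float) -> float:
    """alpha-fair utility U(r): log(r) for alpha = 1, r^(1-alpha)/(1-alpha) otherwise."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    if alpha < 0.0:
        raise ValueError("alpha must be non-negative")
    if math.isclose(alpha, 1.0):
        # Proportional fairness.
        return math.log(rate)
    return rate ** (1.0 - alpha) / (1.0 - alpha)
```

Larger values of α penalize low rates more strongly: for a rate of 2, the utility is −0.5 at α = 2, while for a rate of 4 at α = 0.5 it is 4.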

Finally, in order to confirm the implementation feasibility of the proposed scheme, a numerical example on the computing


[Figure 12: bar chart of Average performance gain (%) on the y-axis (100 to 140). x-axis groups: 1: 5% user rate, 2: 10% user rate, 3: 15% user rate. Legend: α = 0.5, α = 1, α = 1.5, α = 2.]

Fig. 12. The average performance gain of the proposed H-SON extended for α-fairness over SD-SON.

time for the proposed UA scheme in a typical LTE scenario is considered, where each subproblem in the EPC-MME consists of 4 mBSs and 4 hotspots each with 10 sBSs for each mBS. Then, there are about 25 ($\simeq \pi \times 65^2 \times 10 \times 200/10^6$) RRC-connected users around each cluster considering that there are about 200 simultaneously RRC-connected users per macro cell and the higher user density and the area of a typical hotspot, which results in about 1200 ($= 200 \times 4 + 25 \times 16$) RRC-connected users for the above scenario. As shown in Fig. 7, it is sufficient to include only part of the users (say, 20%) so that about 240 ($= 1200 \times 0.2$) users for each subproblem in the EPC-MME and about 25 users for each C-MME need to be jointly handled. By using CVX (MATLAB) with an Intel Core i5-4670 (4 cores and 5.18 GFLOPS/core) and assuming parallel processing for independent computation of the ABS ratio optimization for each mBS or matrix inversion such as in [60], the computing times for $\{\mathbf{X}_d \,|\, d \in \mathcal{B}_m \cup \mathcal{C}\}$ and $\{\varsigma_d \,|\, d \in \mathcal{B}_m \cup \mathcal{C}\}$ are about 0.4 s and 4 s, respectively, and those for the iteration between (P1) and (P2) in each C-MME are about 2.4 s. Note that a real-time optimization computing using CVXGEN is well known to be 500, 2000, or 10000 times faster than that using CVX for a large-size, medium-size, or small-size problem, respectively [55]. Thus, by assuming a state-of-the-art processor such as Intel Xeon E5-2680 v3 (12 cores and 33.6 GFLOPS/core) [61], which has its computing capacity about 18 ($= 3 \times 6$) times greater than that of the Intel Core i5-4670, the computing times for one iteration in 18 subproblems in the EPC-MME and one iteration in each C-MME would be about 0.9 ms and 0.2 ms, respectively. Then, by assuming 4 iterations among the EPC-MME and each C-MME and 1 ms latency for each cluster information, the EPC-MME control plane latency of about 15 ms can be achieved. Also, assuming a pipelined processing among subproblems in the EPC-MME, it can be implemented by employing a dedicated processor in the EPC-MME with current state-of-the-art technologies.
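The user counts in the numerical example above follow from simple arithmetic, which the following Python sketch reproduces. The function names and default arguments are chosen here for illustration; the defaults are the values quoted in the text (hotspot radius 65 m, tenfold user density, 200 users per macro cell, one mBS per square kilometer). Evaluating the hotspot expression directly gives roughly 26.5, which the text rounds to about 25.

```python
import math


def users_per_hotspot(
    radius_m: float = 65.0,
    density_ratio: float = 10.0,
    users_per_macro: float = 200.0,
    macro_area_m2: float = 1e6,
) -> float:
    """Expected RRC-connected users inside one circular hotspot."""
    hotspot_area_m2 = math.pi * radius_m ** 2
    return hotspot_area_m2 * density_ratio * users_per_macro / macro_area_m2


def users_per_subproblem(
    n_mbs: int = 4,
    hotspots_per_mbs: int = 4,
    users_per_macro: int = 200,
    hotspot_users: int = 25,
) -> int:
    """Total users in one EPC-MME subproblem: macro users plus hotspot users."""
    return n_mbs * users_per_macro + hotspot_users * n_mbs * hotspots_per_mbs


def effective_users(total_users: int, participating_fraction: float) -> int:
    """Users left after pre-determining those with a dominant RSRP."""
    return round(total_users * participating_fraction)
```

With the default values, `users_per_subproblem()` returns 1200 and `effective_users(1200, 0.2)` returns 240, matching the figures used in the example.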

VI. CONCLUSION

In this paper, a joint UA scheme with JP-CoMP using a hybrid SON was proposed for a practical cHCN to maximize the network-wide proportional fairness among users, in which a central SON algorithm manages a macroscopic user association and a distributed local SON algorithm in each cluster manages a joint UA with an RP scheme by considering adaptive CoMP mode selection for given user locations. The network architecture and protocol for the hybrid SON in a cHCN was suggested, which coincides with the core network architecture of the LTE-A and can be easily adopted in practice, and then a feasible suboptimal iterative algorithm for determining the joint UA solution of the proposed hybrid SON was provided with the time complexity analysis for implementation feasibility. It is shown that the proposed hybrid SON scheme is very effective in handling the load balancing in a practical cHCN, not only improving the performance of the inner sBS users by reducing the inter-cell interference, especially for intra-tier offloaded users, but also enabling more aggressive inter-tier offloading by effectively improving the link quality of cluster edge users without causing an unnecessary resource waste. Thus, it would be beneficial to apply the proposed solution to a practical cellular network, such as the LTE-A, for better network utilization.

REFERENCES

[1] J. Bagaria and H. Shahnasser, “Meeting challenges of LTE advanced through small cell deployment,” in Journal of advances in computer networks, vol. 3, no. 3, pp. 230-234, September 2015.

[2] M. Mirahsan, R. Schoenen, S. S. Szyszkowicz, and H. Yanikomeroglu, “Measuring the spatial heterogeneity of outdoor users in wireless cellular networks based on open urban maps,” in Proc. IEEE Intern. Confer. on Commun. (ICC 2015), pp. 1-5, June 2015.

[3] L. Chiaraviglio, F. Cuomo, M. Maisto, A. Gigli, J. Lorincz, Y. Zhou, Z. Zhao, C. Qi, and H. Zhang, “What is the best spatial distribution to model base station density? A deep dive into two european mobile networks,” IEEE Access, vol. 4, pp. 1-10, April 2016.

[4] 3GPP TR 36.842 V12.0.0, “Technical specification group radio access network; study on small cell enhancements for E-UTRA and E-UTRAN (Release 12),” December 2013.

[5] K. Pedersen, Y. Wang, S. Strzyz, and F. Frederiksen, “Enhanced inter-cell interference coordination in co-channel multi-layer LTE-advanced networks,” IEEE Trans. on Wireless Commun., vol. 20, no. 3, pp. 120-127, June 2013.

[6] S. Kaneko, T. Matsunaka, and Y. Kishi, “A cell-planning model for hetnet with CRE and TDM-ICIC in LTE-Advanced,” in Proc. IEEE Veh. Techn. Confer. Spring (VTC 2012), pp. 1-5, May 2012.

[7] G. de la Roche, A. D. L.-P. A. Valcarce, and J. Zhang, “Access control mechanisms for femtocells,” IEEE Wireless Commun. Mag., vol. 48, no. 1, pp. 33-39, January 2010.

[8] T. Nakamura, S. Nagata, A. Benjebbour, Y. Kishiyama, T. Hai, S. Xiaodong, Y. Ning, and L. Nan, “Trends in small cell enhancements in LTE advanced,” IEEE Commun. Mag., vol. 20, no. 3, pp. 120-127, June 2013.

[9] B. Bjerke, “LTE-advanced and the evolution of LTE deployments,” in IEEE Wireless Commun., vol. 18, no. 5, pp. 4-5, October 2011.

[10] A. Damnjanovic, J. Montojo, Y. Wei, T. Ji, T. Luo, M. Vajapeyam, T. Yoo, O. Song, and D. Malladi, “A survey on 3GPP heterogeneous networks,” IEEE Wireless Commun. Mag., vol. 18, no. 3, pp. 10-21, June 2011.

[11] H.-S. Jo, Y. J. Sang, P. Xia, and J. G. Andrews, “Heterogeneous cellular networks with flexible cell association: a comprehensive downlink SINR analysis,” IEEE Trans. on Wireless Commun., vol. 11, no. 10, pp. 3484-3495, October 2012.

[12] R. Madan, J. Borran, A. Sampath, N. Bhushan, A. Khandekar, and J. Tingfang, “Cell association and interference coordination in heterogeneous LTE-A cellular networks,” IEEE Journal on Selected Areas in Commun., vol. 28, no. 9, pp. 1479-1489, December 2010.

[13] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable approach to coverage and rate in cellular networks,” IEEE Trans. on Wireless Commun., vol. 59, no. 11, pp. 3122-3134, November 2011.

0018-9545 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2017.2748570, IEEETransactions on Vehicular Technology

LOAD BALANCING SCHEME WITH SMALL-CELL COOPERATION FOR CLUSTERED HETEROGENEOUS CELLULAR NETWORKS, J.-B. PARK AND K. S. KIM 16

[14] H. S. Dhillon, R. K. Ganti, F. Baccelli, and J. G. Andrews, “Modelingand analysis of K-tier downlink heterogeneous cellular networks,” IEEEJournal on Selected Areas in Commun., vol. 30, no. 3, pp. 550-560, April2012.

[15] S. Singh, H. S. Dhillon, and J. G. Andrews, “Offloading in heterogeneousnetworks: modeling, analysis, and design insights,” IEEE Trans. onWireless Commun., vol. 12, no. 5, pp. 2484-2497, May 2013.

[16] J. Ghimire and C. Rosenberg, “Resource allocation, transmission coor-dination and user association in heterogeneous networks: a flow-basedunified approach,” IEEE Trans. on Wireless Commun., vol. 12, no. 3, pp.1340-1351, March 2013.

[17] I. Guvenc, M.-R. Jeong, I. Demirdogen, B. Kecicioglu, and F. Watanabe,“Range expansion and inter-cell interference coordination (ICIC) forpicocell networks,” in Proc. IEEE Veh. Techn. Confer. Fall (VTC 2011),pp. 1-6, September 2011.

[18] D. Lopez-Perez, X. Chu, and I. Guvenc, “On the expanded region ofpicocells in heterogeneous networks,” IEEE Journal on Selected Areas inCommun., vol. 6, no. 3, pp. 281-294, March 2012.

[19] Q. Ye, B. Rong, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “Userassociation for load balancing in heterogeneous cellular networks,” IEEETrans. on Wireless Commun., vol. 12, no. 6, pp. 2706-2716, June 2013.

[20] K. Shen and W. Yu, “Downlink cell association optimization for het-erogeneous networks via dual coordinate descent,” in Proc. IEEE Intern.Confer. on Acoustics, Speech and Signal Processing (ICASSP 2013), pp.4779-4783, May 2013.

[21] T. Zhou, Y. Huang, and L. Yang, “QoS-aware user association for loadbalancing in heterogeneous cellular networks,” in Proc. IEEE Veh. Techn.Confer. Fall (VTC 2014), pp. 1-5, September 2014.

[22] Q. Ye, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “On/offmacrocells and load balancing in heterogeneous cellular networks,” inProc. IEEE Global Commun. Confer. (GLOBECOM 2013), pp. 3814-3819, December 2013.

[23] A. Bedekar and R. Agrawal, “Optimal muting and load balancing foreICIC,” in Proc. 11th Intern. Symp. and Workshops on Modeling andOptimization in Mobile, Ad Hoc and Wireless Networks (WiOpt 2013),pp. 280-287, May 2013.

[24] Y. Jin and L. Qiu, “Joint user association and interference coordinationin heterogeneous cellular networks,” IEEE Wireless Commun. Letters, vol.17, no. 12, pp. 2296-2299, December 2013.

[25] Q. Li, R. Q. Hu, G. Wu, and Y. Qian, “On the optimal mobile associationin heterogeneous wireless relay networks,” in Proc. IEEE Intern. Confer.on Computer Commun. (INFOCOM 2012), pp. 1359-1367, March 2012.

[26] D. Fooladivanda, A. A. Daoud, and C. Rosenberg, “Joint channel allo-cation and user association for heterogeneous wireless cellular networks,”in Proc. IEEE 22nd Intern. Symp. on Personal, Indoor and Mobile RadioCommun. (PIMRC 2011), pp. 384-390, September 2011.

[27] S. V. Hanly, “An algorithm for combined cell-cite selection and powercontrol to maximize cellular spread spectrum capacity,” IEEE Journal onSelected Areas in Commun., vol. 13, no. 7, pp. 1332-1340, September1995.

[28] S. Das, H. Viswanathan, and G. Rittenhouse, “Dynamic load balancingthrough coordinated scheduling in packet data systems,” in Proc. IEEEIntern. Confer. on Computer Commun. (INFOCOM 2003), pp. 786-796,March 2003.

[29] P. Lu and H. Yang, “Sum-rate analysis of multiuser MIMO system withzero-forcing transmit beamforming,” IEEE Trans. on Commun., vol. 57,no. 9, pp. 2585-2589, September 2009.

[30] T. K. Y. Lo, “Maximum ratio transmission,” IEEE Trans. on Commun.,vol. 47, no. 10, pp. 1458-1461, October 1999.

[31] 3GPP TS 36.300 V11.7.0, “Evolved universal terrestrial radio accessnetwork (E-UTRAN); overall description 2,” September 2013.

[32] H. Klessig, V. Suryaprakash, O. Blume, A. Fehske, and G. Fettweis, “Aframework enabling spatial analysis of mobile traffic hot spots,” IEEEWireless Commun. Letters, vol. 3, no. 5, pp. 537-540, October 2014.

[33] NGMN Alliance, “Small cell backhaul requirements,” White paper, June2012.

[34] Y. Weng, W. Xiao, and L. Xie, “Total least squares method for robustsource localization in sensor networks using TDOA measurements,”Intern. Journal of Distributed Sensor Networks Volume 2011, Article ID172902, 8 pages, June 2011.

[35] P. Sasikumar and S. Khara, “K-means clustering in wireless sensor net-works,” in Proc. IEEE 4th Intern. Confer. on Computational Intelligenceand Commun. Networks (CICN 2012), pp. 140-144, November 2012.

[36] T. M. Kodinariya and P. R. Makwana, ‘Review on determining numberof Cluster in K-Means clustering,” Intern. Journal of Advanced Researchin Computer Science and Management Studies, vol. 1, no. 6, pp. 90-95,November 2013.

[37] L. Liu, Y.-H. Nam, and J. Zhang, “Proportional fair scheduling formulti-cell multi-user MIMO systems,” in Proc. 44th Annual Confer. Info.Sciences and Systems (CISS 2010), pp. 1-6, May 2010.

[38] F. Kelly, “Charging and rate control for elastic traffic,” European Trans.on Telecommun., vol. 8, no. 1, pp.33-37, 1997.

[39] T. Bu, L. Li, and R. Ramjee, “Generalized proportional fair schedulingin third generation wireless data networks,” in Proc. IEEE Intern. Confer.on Computer Commun. (INFOCOM 2006), pp. 1-12, April 2006.

[40] R. Kwan, C. Leung, and J. Zhang, “Resource allocation in an LTE cel-lular communication system,” in Proc. IEEE Intern. Confer. on Commun.(ICC 2009), pp. 1-5, June 2009.

[41] C. Guo, M. Sheng, X. Wang, and Y. Zhang, “Joint scheduling and asso-ciation for α-fairness network utility maximization in cellular networks,”in Proc. Personal, Indoor and Mobile Radio Commun. (PIMRC 2013),vol., no., pp. 1769-1773, September 2013.

[42] R. W. Heath, Jr., T. Wu, Y. H. Kwon, and A. C. K. Soong, “MultiuserMIMO in distributed antenna systems with out-of-cell interference,” IEEETrans. on Signal Processing, vol. 59, no. 10 pp. 4885-4899, October 2011.

[43] C. K. Au-Yeung and D. J. Love, “On the performance of random vectorquantization limited feedback beamforming in a MISO System,” IEEETrans. on Wireless Commun., vol. 6, no. 2, February 2007.

[44] M. Kang, Y. J. Sang, H. G. Hwang, H. Y. Lee, and K. S. Kim, “Per-formance analysis of proportional fair scheduling with partial feedbackinformation for multiuser multicarrier systems,” in Proc. IEEE Veh. Techn.Confer. Spring (VTC 2009), pp. 1-5, April 2009.

[45] http://cvxr.com/, 2012 CVX Research, Inc.[46] S. Burer and A. N. Letchford, “Non-convex mixed-integer nonlinear pro-

gramming: a survey,” Surveys in Operations Research and ManagementScience 17, pp. 97-106, 2012.

[47] S. A. Ramprashad, G. Caire, and H. C. Papadopoulos, “A joint schedul-ing and cell clustering scheme for MU-MIMO downlink with limitedcoordination,” in Proc. IEEE Intern. Confer. on Commun. (ICC 2010),pp. 1-6, May 2010.

[48] Y.-X. Yuan, “A review of trust region algorithms for optimization,” inProc. 4th intern. congress on industrial and applied mathematics (ICIAM1999), pp. 271-282, 1999.

[49] S. Boyd and L. Vandenberghe, “Convex optimization,” CambridgeUniversity Press, 716 pages, 2004.

[50] D. Coppersmith and S. Winograd, “Matrix multiplication via arithmeticprogressions,” Journal of Symbolic Computation, no. 9, pp. 251-280,1990.

[51] NOKIA, “LTE radio transport security,” White paper, pp. 1-27, June2015, available: http://resources.alcatel-lucent.com/asset/200321.

[52] H. Holma, A. Toskala, and J. Reunanen, “LTE small cell optimization:3GPP evolution to release 13,” John Wiley & Sons Ltd, 462 pages,November 2015.

[53] 3GPP TR 25.912 V9.0.0, “Universal mobile telecommunications system(UMTS); LTE; feasibility study for evolved universal terrestrial radioaccess (UTRA) and universal terrestrial radio access network (UTRAN),”October 2009.

[54] A. Farshad, M. K. Marina, and F. Garcia, “Urban WiFi characteriza-tion via mobile crowdsensing,” in Proc. IEEE Network Operation andManagement Symp. (NOMS 2014), pp. 1-9, May 2014.

[55] J. Mattingley and S. Boyd, “CVXGEN: a code generator for embeddedconvex optimization,” Optimization and Engineering, vol. 13, no. 1, pp.127, 2012.

[56] M. Tolstrup, “Indoor radio planning: A practical guide for 2G, 3G and4G,” John & Sons, Ltd., 610 pages, May 2015.

[57] 3GPP R1-130856, Evaluation assumptions for small cell enhancements-physical layer, Huawei, HiSilicon, February 2013.

[58] J. Colom Ikuno, M. Wrulich, and M. Rupp, “Performance and modelingof LTE H-ARQ,” in Proc. ITG International Workshop on Smart Antennas(WSA 2009), pp. 1-6, February 2009.

[59] J. Colom Ikuno, M. Wrulich, and M. Rupp, “System level simulation ofLTE networks,” in Proc. IEEE Veh. Techn. Confer. Spring (VTC 2010),pp. 1-5, May 2010.

[60] P. Ezzatti, E. Quintana-Orti, and A. Remon, “High performance matrixinversion on a multi-core platform with several GPUs,” in Proc. 19th Eu-romicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 2011), pp. 87-93, February 2011.

[61] Intel Corporation, “Evolved Packet Core (EPC) for communicationsservice providers,” Solutions Reference Architecture revision 1.2, pp. 1-11, May 2016.

Jin-Bae Park was born in Incheon, Korea, on 22 June 1980. He received the B.S. and M.S.E. degrees in electrical and electronic engineering from Yonsei University, Seoul, South Korea, in 2006 and 2008, respectively. He is currently pursuing the Ph.D. degree at the Department of Electrical and Electronic Engineering, Yonsei University. His research interests include heterogeneous cellular networks, coordinated multi-cell processing, and radio resource management.

Kwang Soon Kim (S’95, M’99, SM’04) was born in Seoul, South Korea, in 1972. He received the B.S. (summa cum laude), M.S.E., and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 1994, 1996, and 1999, respectively.

From 1999 to 2000, he was a Post-Doctoral Researcher with the Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA, USA. From 2000 to 2004, he was a Senior Member of the Research Staff with the Mobile Telecommunication Research Laboratory, Electronics and Telecommunication Research Institute, Daejeon. Since 2004, he has been a Professor with the Department of Electrical and Electronic Engineering, Yonsei University, Seoul. His research interests include signal processing, communication theory, information theory, stochastic geometry applied to wireless heterogeneous cellular networks, wireless local area networks, wireless D2D networks and wireless ad hoc networks, and new radio access technologies for 5G.

Dr. Kim received the Post-Doctoral Fellowship from the Korea Science and Engineering Foundation in 1999, the Outstanding Researcher Award from the Electronics and Telecommunication Research Institute in 2002, the Jack Neubauer Memorial Award (Best System Paper Award, the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY) from the IEEE Vehicular Technology Society in 2008, and the LG Research and Development Award: Industry-Academic Cooperation Prize, LG Electronics, in 2013. He served as an Editor of the Korean Institute of Communications and Information Sciences (KICS) from 2006 to 2012, and the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS from 2009 to 2014. He has been an Editor of the Journal of Communications and Networks since 2008. He has been Editor-in-Chief of the KICS since 2013.