Low-Power SDR Design on an FPGA for Intersatellite ...users.wpi.edu/~xhuang/pubs/2018_cai_tvlsi.pdfpaper presents a complete software-deﬁned radio (SDR) model for intersatellite

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 11, NOVEMBER 2018 2419

Low-Power SDR Design on an FPGA forIntersatellite Communications

Xin Cai, Mingda Zhou , Tian Xia, Senior Member, IEEE, Wai H. Fong, Wing-Tsz Lee,

and Xinming Huang , Senior Member, IEEE

Abstract— Small satellite systems make space missions forcommunication, navigation, and scientific research more real-izable and diversified. Small satellites flying in large clus-ter or constellation formation as a network can provide aneconomical access to accomplish more complex missions, suchas distributed computation, high-resolution imaging, and space-craft maintenance. An increasing number of satellites operatingon lower earth orbit for complex missions require a wirelesscommunication system that is both reliable and flexible. Thispaper presents a complete software-defined radio (SDR) modelfor intersatellite communications (ISCs) and its implementationon a field-programmable gate array (FPGA). The proposed SDRfor transmitter and receiver only has a power consumptionof 2.1 and 3.2 W, respectively, which is suitable for power-limitedsmall satellite systems. Algorithms and parameters of each blockare optimized aiming at reducing hardware resource utilization.A low-density parity-check code constructed by the Euclideangeometry method is adopted as the channel code for forwarderror correction. Implementations of the synchronization, demod-ulation, and decoding algorithms are optimized for hardwareefficiency. The low-power SDR designs are implemented on anFPGA-based experimental platform and successful demonstratedby over-the-air transmissions.

Index Terms— Field-programmable gate array (FPGA),intersatellite communications (ISCs), low-density parity-check(LDPC) code, software-defined radio (SDR), synchronization.

I. INTRODUCTION

SMALL spacecraft has attracted many interests for launch-ing scientific missions at or beyond lower earth orbit

in recent years. Small satellites are becoming an importantpart in small spacecraft missions owing to their advantagesof low cost, small weight, short lead time, and miniaturesize [1]. Hundreds of small satellites flying together are ableto accomplish more missions, e.g., higher precision naviga-tion, formation control, data exchange, information processing,

Manuscript received January 23, 2018; revised May 10, 2018 andJune 19, 2018; accepted June 21, 2018. Date of publication July 18, 2018;date of current version October 23, 2018. This work was supported in partby the U.S. National Science Foundation under Grant 1401121 and Grant1414250 and in part by the National Aeronautics and Space Administrationunder Grant NNX16AB06A. (Corresponding author: Xinming Huang.)

X. Cai is with the School of Telecommunication Engineering, XidianUniversity, Xi’an 710071, China.

M. Zhou and X. Huang are with the Worcester Polytechnic Institute,Worcester, MA 01609 USA (e-mail: [email protected]).

T. Xia is with The University of Vermont, Burlington, VT 05405 USA.W. H. Fong and W.-T. Lee are with the Goddard Space Flight Center,

Greenbelt, MD 20771 USA.Color versions of one or more of the figures in this paper are available

online at http://ieeexplore.ieee.org.Digital Object Identifier 10.1109/TVLSI.2018.2850746

and spacecraft maintenance [2]. To integrate the process-ing capability, software and hardware resources of a singlesatellite, and to keep a group of formation-flying satelliteswork efficiently, all the satellites have to communicate witheach other. Thus, intersatellite communications (ISCs) withreal-time processing are required to make a large number ofsmall satellites that work as a network [3], [4]. A communi-cation system design with high data rate, low bit error rate(BER), and real-time transmission is a prerequisite for anefficient ISC.

The open system interconnection (OSI) and the derivativemodels are usually adopted to design the ISCs’ structure. Fre-quency allocation, data rate, modulation, code schemes, andplatform are the main considerations in the OSI physical layer.The choice of a frequency band for ISCs is limited by severalaspects: spectrum allocated by the International Telecommu-nication Union, the available hardware, power supply, antennasize, and mission requirements. Typically, frequency allocationcandidates for ISC include S-, K u-, and K a-bands [5], [6].It is also known that a higher frequency for ISC not onlyhelps to reduce the antenna size and mass of the transceiver butalso makes a wider bandwidth available [7]. Binary phase-shiftkeying (BPSK), quadrature phase-shift keying (QPSK), andoffset QPSK (OQPSK) are preferred in the ISC. A coherentBPSK system needs less power at the same BER level andprovides higher sensitivity for communication and tracking.QPSK can enhance the spectral efficiency by carrying 2-bitinformation in each symbol, and OQPSK overcomes a suddenphase change problem in QPSK.

There are several research works on ISC in the existingliterature, mostly focused on theoretical studies and numericalsimulations. A K a-band test bed was developed in [8], anda rate 1/2 convolution code with a constraint length of 7 wasimplemented with the OQPSK modulation in the downlink andthe BPSK modulation in the uplink. However, their test resultshave shown that this test bed in a convolution code schemecould only achieve a BER of 10−4 at 5-dB bit energy-to-noisevariance (Eb/N0). In [9], a rate 1/2 low-density parity-check(LDPC) (512, 256) code was adopted to improve the BERperformance. They demonstrated that this LDPC code had abetter error correction capability than the convolution code.However, as many as 50 iterations were needed in the LDPCdecoder.

Edmonson et al. [10] presented an integration of responsiveand formal design (RFD) with an OSI framework for ISC.

1063-8210 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

https://orcid.org/0000-0002-3410-5471

https://orcid.org/0000-0003-0584-3448

2420 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 11, NOVEMBER 2018

Different levels of design abstraction for the RFD process areintroduced. In [11], several methods were presented to improvea CPU and field-programmable gate array (FPGA) perfor-mance for satellite applications, and also the study describedsome potential software-defined radio (SDR) architectures tosupport signals for distributed satellite systems, but no designdetails for ISCs were illustrated. Based on the structure ofsmall satellite networks (SSNs), Zhou et al. introduced thecontact plan design problems [12]. The study was focusedon SSNs to characterize the resources and to maximize thenetwork data volume.

The SDR platform has been widely used in ISCs for itsunique capabilities: technological flexibility, adaptive and mul-timode operations, programmable reconfiguration, and con-venient upgrade [13], [14]. These capabilities allow the SDRplatform to remotely change or upgrade the features of a radioto adapt to a dynamic environment and meet new requirements.CubeSat SDR can implement several communication protocolsand support data rate of 10 Mb/s [15]. Researchers at theNational Aeronautics and Space Administration (NASA) usedthe spectrum SDR-3000 platform as a test bed to simulateand demonstrate the inner satellites crosslink [16]. In [6],Universal Software Radio Peripheral N210 was employed withits transceiver operating in the frequency range from 400 MHzto 4.4 GHz. An AITech s950 single-board computer was inte-grated into the Harris SDR [8]. In [17], an FPGA-based SDRarchitecture was designed for the FUNcube-1(AO-73) CubeSatbeacon signals on a platform with AD-FMCOMMS3 radiofront end and ZedBoard FPGA back end.

In this paper, an SDR model for ISCs is designedand also implemented on the Zynq system-on-chip (SoC)FPGA-based SDR platform. The model is aimed at exploringan FPGA-based SDR architecture to provide an efficient datalink for ISCs. The system model includes the entire ISCscomponents: coding, OQPSK modulation, and pulse shapingin the transmitter and automatic gain control (AGC), frequencycompensation, timing recovery, frame synchronization, softdemodulation, and decoding in the receiver. The main con-tributions of this paper can be summarized as follows.

1) A complete SDR model and a system design for ISCsare proposed with performance simulation results. Theproposed SDR architecture is model-based designedthrough using multiple tools and programming lan-guages. Each individual component can be easilyupdated or replaced using a new module accordingto the targeted communication standard. The FPGAimplementations of the transmitter and the receiver onlyconsume the power of 2.1 and 3.2 W, respectively.

2) Different from the conventional LDPC codes encodingmethods, we propose a new encoding scheme aim-ing at saving more hardware resources. Furthermore,an improved encoding algorithm and the correspondingencoder structure are also proposed. The improved algo-rithm helps to save hardware resources significantly bysharing the parity check bit generator.

3) A high code rate LDPC (255, 175) code is constructedwith the Euclidean geometry method, which improvesthe BER performance significantly compared with the

convolution code used in the Harris test bed and theLDPC code in [9]. A simplified LDPC decoder witha fully parallel structure is designed, in which an effi-cient three-level comparison tree is proposed to realizethe check nodes (CNs) update. We also apply sev-eral optimization techniques that reduce the hardwareusage by two thirds with only very small performancedegradation.

4) A simplified magnitude detection scheme is proposed,and the optimal combination coefficients are foundto reduce the error introduced by the simplification.An improved Luise algorithm is adopted for the coarsefrequency compensation (CFC) aiming at reducing thecomputation complexities. An over-the-air transmissionin the S-band is demonstrated, and a signal quality ismeasured during the real-time transmissions.

The remaining sections of this paper are organized asfollows. The features of intersatellite links and some consid-erations for ISCs are discussed in Section II. Design strategiesand related algorithms are illustrated in Section III. The struc-ture of transmitter design is revealed in Section IV. Section Villustrates the implementation details of the receiver. Theprocess of targeting the transceiver models on the SDRplatform, implementation results, and so on are discussed inSection VI. Conclusions are made in Section VII.

II. FEATURES OF INTERSATELLITE COMMUNICATION

LINKS AND DESIGN CONSIDERATIONS

This paper is aimed to design efficient ISCs links for agroup of small satellites flying in cluster formation, in which100∼150 small satellites are deployed in a limited sphericalspace with a diameter of 100 m at an altitude of 500 kmabove the earth. These satellites are expected to be only oflaptop size with the total mass less than 5 kg and the totalpower less than 10 W. Small latency is required so that ISCslinks can be responded in a short time. It is known from [1]that the rechargeable battery system occupies about one-thirdweight of the entire satellite and the battery cell energy densityranges from 120 to 250 Wh/kg. For an individual satelliteweighted less than 5 kg, the battery system cannot be heavierthan 1.6 kg, which can provide a limited energy of about192∼400 Wh. Compared with traditional wireless communi-cation systems, the ISCs system has stringent limitations onsize, weight, and power. Each module in the system shouldbe optimized for hardware and energy efficiency. Many issuesneed to be considered, including the limitation of modulationtypes, the choice of error correction code, the AGC to over-come signal fading, the compensation for Doppler frequencyshift, and radiation-hardening devices.

When engineers design a satellite system, high-powertraveling-wave tubes (TWTs) and solid-state power amplifiersare often utilized. To obtain the maximal amplification effi-ciency and maximize the output power, TWTs and solid-statepower amplifiers are usually designed to work at or beyond thecompression point, which may bring distortion to the originalsignals. The signals around the outer states of a constellationare more susceptible to distortion, since they need more powerto transmit. The distortion caused by the amplifiers drives

CAI et al.: LOW-POWER SDR DESIGN ON AN FPGA FOR INTERSATELLITE COMMUNICATIONS 2421

the received signals around the states to move toward theinner rings of a constellation, resulting in much difficultyduring decoding. Therefore, the modulation schemes for smallsatellites are often very limited. Typical modulation candidatesfor ISCs include BPSK, QPSK, OQPSK, and amplitude phaseshift keying

The power limitation also has an effect on the choice ofcode length. A decoder for a longer code often requires morehardware resources, resulting in higher power consumption.In addition, a longer code may also increase the decodinglatency. Therefore, a shorter code is often selected for ISCs.NASA’s first K a-band transceiver implemented on Harris’sSDR adopted a rate 1/2 convolution code with a constraintlength of 7 [8]. In this paper, the (255, 175) LDPC code isconstructed, which has much better performance than the con-volution code in Harris’s SDR and also meets the requirementsof small decoding latency.

Free-space loss refers to the energy loss of electromag-netic wave when it travels in the air. An electromagneticwave mainly travels in a free space beyond the atmosphereand the free-space loss is proportional to the square ofcarrier frequency and the square of the distance betweentwo satellites [18]. The channel model for ISCs is mainlydetermined by the free-space loss and thermal noise from theelectronics so that the channel can be assumed as additivewhite Gaussian noise [19]. To overcome the potential signalfading, an AGC module is designed to adjust the amplitude ofthe received signals automatically.

The Doppler effects in an orbiting cluster are expectedto be insignificant, since all the satellites fly in a limitedspherical space and the relative velocity between the satellitesis small. The Doppler effect is dominated by the line ofsight velocity differences. For intersatellite links working inthe K a-band, the maximum Doppler shift frequency is arounda few hundreds of hertz. The Doppler shift frequency is smallerfor satellites that operate in a lower frequency band. However,a frequency shift detector and a compensator are still needed.In this paper, a two-step frequency compensator is designedto compensate the frequency shift caused by the transmissionand Doppler effect.

Single-event upset introduced by a cosmic ray has aneffect on most commercial off-the-shelf devices. Semicon-ductor components easily suffer from the radiation damage.This drives many manufacturers to develop radiation-tolerancedevices. Xilinx has already released a series of rad-hard andrad-tolerant FPGAs. These rad-tolerant FPGAs are robustlydesigned to meet the reliability requirements in space envi-ronment. The reconfigurability of an FPGA allows the mod-ifications at any time and upgrades of system after launchwithout any hardware replacement, which greatly reduces thecost and risk of a space mission.

III. DESIGN STRATEGIES AND RELATED ALGORITHMS

The specification of laptop-size satellites determines thatthey must have these special features: low weight, extremelylimited hardware resources, and limited power. Despite theselimitations, a reliable link is still needed. A qualified trans-ceiver for ISCs must meet these requirements very well.

Fig. 1. Example of the encoding process.

A. Resources-Saving Strategies

For the same BER performance, the signal-to-noise ratiorequirement can be significantly reduced with forward errorcorrection (FEC) coding, which also leads to lower trans-mission power. An LDPC code is widely used in the latestcommunication standards, since its capacity can approach theShannon limit quite closely.

In small satellite communications, a codeword with somespecial requirements is needed: a code rate is higher than1/2 and the size of information block or code block should bean integer multiples of 8 or 16 [6] so that information blockcan be integral multiple of a byte. In this paper, a 2-D type-I(0, 4)th-order Euclidean geometry LDPC code with 175 infor-mation bits and 255 code bits, denoted as EG-LDPC(255, 175), is adopted [20]. Considering the OQPSK mod-ulation adopted in this model, an extra bit 0 is added, whichresults in the code length of 256.

An (n, k) LDPC code can be defined by its parity checkmatrix H. The H matrix of the EG-LDPC (255, 175) is of size255 × 255. The fact that the dimension of this matrix is only80 makes it possible to simplify the matrix H. The 80 rowsthat are linearly independent can be found. Thus, the 80 rowscan form a new parity check matrix H�.

A traditional way to obtain the systematic codeword is toconvert H into the form of [Hd, Hp] (denoted as Hl ), whereHp is a lower triangular matrix.

In this paper, we change H� into another special form(denoted as Hd ) to implement the encoding so that some ben-efits and different characters can be obtained. The submatrixHp is a “dual-diagonal” matrix, which can be expressed as

Hp =

⎛⎜⎜⎜⎝

11 1

. . .. . .

1 1

⎞⎟⎟⎟⎠. (1)

The encoding can be realized by implementing the followingrecursion (set p0 = 0):

p j = p j−1 ⊕(

⊕i∈B j

di

), j = 1, 2, . . . , (n − k) (2)

where di is the i th information bit, p j is the j th parity checkbit, and B j = {i : Hd

j,i = 1}. An example of the encodingprocess is shown in Fig. 1.

The benefits of encoding based on this “dual-diagonal” formcan be illustrated from three aspects. First, the codes that aredefined by Hd (Hp is of the “dual-diagonal” form) constitute


a subclass of LDPC codes. There are many LDPC codes, suchas repeat accumulate (RA) codes [21], semirandom LDPC(SRLDPC) [22], and concatenated tree (CT) codes [23], whichhave the same Hp matrix. Thus, this subclass of an LDPCcode may have the same features with RA, SRLDPC, and CTcodes. Second, the “dual-diagonal” form makes the hardwareimplementation more flexible and contributes to saving hard-ware resources. When the parity check bit p j is calculated,designers only need the value of the former parity checkbit p j−1 and the values of the related information nodes(information nodes between two black nodes p j−1 and p j

in Fig. 1). In addition, there are only two “1” in each rowof Hp except the first row. Compared with traditional Hl

with a lower triangular matrix, fewer XOR operations arerequired when the parity check bits are calculated. Thus, fewerXOR units are needed in hardware implementation, whichhelps to save hardware resources. Finally, fewer computationsare required in the encoding process. There are (n2 − k2)/2computations in the traditional encoding method. However,there are at most (k + 2)(n − k) computations needed in theencoding process based on Hd .

Although the improvements from the above-describedencoding procedure help to reduce the XOR operations, thereare still 80 groups of XOR operations needed when theencoder for EG-LDPC (255, 175) is implemented. Motivatedby the module-sharing technology, we further propose a newarchitecture in which the parity check bit generator is shared.There is only one group of XOR operation required in thenew structure. The details of the structure will be illustratedin Section IV.

Since the H matrix of this code is of size 255 × 255, typicaldecoding algorithms, such as sum-product algorithm (SPA)and min-sum algorithm (MSA), indicate that a conventionaldecoder with a fully parallel structure requires 255 check noteprocessors (CNPs) and 255 variable node processors (VNPs).To reduce the hardware resources, a simplified decoder is pro-posed and implemented. The proposed decoder is implementedbased on the new parity check matrix H� so that only 80 CNPsand 255 VNPs are required, which results in fewer usageof 175 CNPs.

Before the radio-frequency (RF) signals are fed into FPGA,an AGC module is necessary to regulate the amplitude ofsignals. The amplitude detection is involved in the square rootoperations. Conventional methods mainly adopt the intellectualproperty (IP) cores, lookup table (LUT) or coordinate rotationdigital computer (CORDIC) algorithm. Since the complexsignals of RF modules entering the FPGA are distributed ina limited area around the constellation points, an alternativesimplification method is proposed, in which the amplitudeis estimated by the linear combination of the in-phase andquadrature components and only one adder is required.

B. Building Reliable Links’ Strategies

Reliable links are required to ensure the communicationsamong satellites. The receiver should have good capabilitiesof correcting the frequency offset and the timing error andfinding the frame header.

Fig. 2. Performance comparison of two frequency compensation schemes.

A frequency compensation module is needed to correct thefrequency offset introduced by the transmission and Dopplereffect. This module is typically realized by two methods: finefrequency compensation (FFC) and CFC followed by FFC.Although FFC may require fewer hardware resources, it onlyprovides reliable compensation within a very narrow range offrequency offset. The adoption of CFC followed by the FFCcan ensure a low BER for a wider range of frequency offset.The performance comparison of the two schemes is shownin Fig. 2. The comparison indicates that two-step frequencycompensation (CFC followed by the FFC) is able to handle awide range of frequency offset.

In this paper, a two-step frequency compensator is designedto compensate the signal frequency shift and Doppler fre-quency shift. The frequency compensator is implementedthrough obtaining the information from the fourth power ofthe received signal. This helps to save hardware resourcescompared with the fast Fourier transform (FFT)-based method.The mathematical expression of the received signals can bee j (kπ/2+� f t+�φ), where k is an integer that ranges from 0 to 3,� f is the frequency offset, and �φ is the phase offset. In thefirst stage, an CFC module estimates the frequency offset� f �. In the next stage, an FFC module estimates the residualfrequency offset � f �� and the phase offset �φ�. Once � f �,� f ��, and �φ� are obtained, the signals can be recovered ase j (kπ/2+� f t+�φ−� f �t−� f ��t−�φ�).

The precise symbol timing must be known in order todemodulate the symbols. In this paper, a zero-crossing timingerror detector (ZCTED) is adopted to calculate the timingerror and the half symbol timing offset introduced in theOQPSK modulation. The ZCTED is designed by finding thezero crossings in the eye diagram [24]. The timing error isfed into a loop filter and then goes through the interpolationcontrol block. The interpolation control block calculates thetiming difference between the actual sampling instant and theoptimal instant. Interpolants are generated by the interpolationfilter at the optimum sampling instants.

The frame synchronization module is used to find thestart of a frame and locate the beginning of the payload.A frame synchronization module that has low false-detectionand misdetection probability is required. Typically, a pilot is


adopted to facilitate the header detection. In this paper, a 13-bit Barker code is used as the pilot, since this Barker code is amaximum autocorrelation sequence with sidelobes not largerthan one [25]. The optimal autocorrelation properties of theBarker code enable the system to find the header of a framewith high confidence.

C. Related Algorithms

Since the new matrix Hd is adopted to conduct the encoding,some modifications are made compared with the traditionalencoding algorithm. The pseudocode for the entire encodingprocedure is given in Algorithm 1, which also gives furtherexplanation to 2.

Algorithm 1 Encoder Based on Hd

Input: information bits vector: (d1, d2, . . . dk) and Hd

Output: codeword: C1 allocate two pieces of storage D and P with the length of

k and (n − k) respectively, set D = 0, P = 0;2 l=1, m=1, Bm = {i : Hd

m,i = 1};3 for l = 1 to n do4 if l <= k then5 D[l] = dl ;6 C[l] = dl ;

7 else

8 P[m] = P[m − 1] ⊕(

⊕i∈Bm

Di

);

9 m = m + 1;10 update Bm ;11 C[l] = P[m];

Although fewer XOR operations are needed in this algo-rithm, it can be found that (n−k) groups of XOR operations arestill required. Thus, we further improve the encoding algorithmto reduce the group numbers of XOR operations. Assume thatTemp is a bit vector with the length of b and F(Temp, b) isa function that returns the XOR value of the b elements in thevector Temp. The improved encoding procedure is illustratedin Algorithm 2.

In the improved algorithm, a group of XOR operationsF(Temp, dmax) is shared for (n − k) times. When a certainparity check bit is generated, designers only need to updatethe Temp vector. The updating process of the vector Temp isequal to finding all the bits in the set {di : i ∈ Bm}.

The SPA has a powerful capability of correcting the biterrors for LDPC codes. However, a large amount of complexcomputations prevents the adoption of SPA in hardware imple-mentation. The MSA is an ideal replacement to simplify theSPA due to the linear complexity and easier implementationof the CNs’ update. The major operation of the CNs’ updateprocess in the MSA is equal to finding the minimal value fromall the neighbors of a CN. However, successive comparisonsto obtain the minimal value require a lot of clock cycles,which also results in long decoding latency. This must beavoided in the ISCs. Considering the new matrix H� of this

Algorithm 2 Improved Encoder Based on Hd

Input: information bits vector: (d1, d2, . . . dk) and Hd

Output: codeword: C1 allocate two pieces of storage D and P with the length of

k and (n − k) respectively, set D = 0, P = 0;2 find the maximal row degree dmax of Hd ;3 allocate a piece of storage Temp with the length of dmax ,

set Temp = 0;4 l=1, m=1, Bm = {i : Hd

m,i = 1};5 for l = 1 to n do6 if l <= k then7 D[l] = dl ;8 C[l] = dl ;

9 else10 Temp = {Di : i ∈ Bm};11 goto line 12;

12 P[m] = P[m − 1] ⊕ (F(Temp, dmax));13 C[l] = P[m];14 Temp = 0;15 m = m + 1;16 update Bm ;17 goto line 5;

EG-LDPC (255, 175) code still has a regular row degreeof 16, we designed a three-level comparison tree that canfind the minimal and second minimal values in only oneclock cycle. More details about the MSA are introducedin [26].

In the frequency compensation module, FFT-based andcorrelation-based algorithms (also known as the Luisealgorithm [27]) are the common methods to estimate thefrequency offset. The FFT-based algorithm is suitablefor all modulation types and the latter for PAK andpulse-amplitude modulation. Compared with the FFT-basedalgorithm, the Luise algorithm requires fewer hardwareresources. In the Luise algorithm, there are M registerweighted by 1/(N − i), i = 1, 2, . . . M , where N is thenumber of samples. In this paper, N is set to be 512 tofacilitate the bit shift operation and M is chosen as 3.Three coefficients (1/511, 1/510, and 1/509) are approx-imated as 1/512. Thus, the corresponding three multipli-cations can be replaced with the only one multiplicationwith 1/512 which can be implemented with a 9-bit shiftoperation.

IV. TRANSMITTER MODEL DESIGN

The SDR structure of the transmitter system model is shownin Fig. 3, in which an FPGA SoC (Zynq 7020) is deployedas the baseband processor to generate baseband signals andan RF transceiver (AD-FCOMMS3-EBZ) is used as the frontend to transmit an RF waveform.

As shown in Fig. 3, the transmitter model mainly consistsof an onboard computer/PC, an LDPC encoder, an OQPSKsymbol mapping block, a pulse shaping filter, a controller, andan RF transceiver. Among these baseband blocks, the LDPC


Fig. 3. SDR architecture of the transmitter system model.

encoder and the OQPSK symbol mapping are implementedwith the “black box” tools in the HDL Coder [28], while othermodels are implemented by building Simulink blocks. Blackbox provides an efficient way to embed the customized HDLcode, such as Verilog, in the Simulink environment. All thesemodules are designed and optimized in the HDL Coder fortargeting to an FPGA [29].

Data bits are generated by the onboard computer/PC andthen sent to the FPGA through the Ethernet interface. Theinput signal to the FPGA is a 32-bit bus, and only the data bitsare extracted to go through the LDPC encoder. Subsequently,the coded bits are mapped into OQPSK symbols and then gothrough a square root raised cosine filter. The pulse shapingsignals are upconverted to an RF by a carrier wave at 2.4 GHzthrough the AD-FCOMMS3-EBZ radio board. Finally, the RFsignals are transmitted through the antenna.

A. LDPC Encoder

The H matrix of EG-LDPC (255, 175) can be generatedby taking the incidence vector of a line in EG (2, 24) andcyclically shifting the incidence vector [20]. Such a construc-tion method not only generates a code with a large minimumdistance (distance of 17) but also boosts the convergence ofthe decoding process. The large minimum distance helps thedecoder to lower the error floor and improve the BER per-formance. Since the proposed LDPC code converges quickly,only a small number of decoding iterations are needed to reachthe required BER performance. This will be demonstratedin Section V-B. The H matrix of this code is a squarematrix of size 255 × 255 with both row and column degreesof 16.

Fig. 4 shows the structure of the LDPC encoder, which iscomprised of a finite-state machine (FSM) controller, a bitbuffer, information bits selector and a parity check bit gen-erator. This encoder is implemented based on the matrix Hd .When a new frame comes, both the pilot bits and informationbits directly go through the encoder. At the same time,the information bits are stored in the bits buffer. When theparity check bit p j is calculated, the related information bits

TABLE I

RESOURCES COMPARISON OF THE ENCODER IN ALGORITHMS 1 AND 2

{di : i ∈ B j } are selected in the information bits selector.Simultaneously, these selected bits and the parity check bitp j−1 are fed into the parity check bit generator. The paritycheck bit generator is a module that has dmax + 1 ports andreturns the XOR value of all the input bits. If the number ofthe selected bits is less than dmax, values on the other portwill be set to 0. Since the oversample factor is configured asfour, there are four clock cycles in each symbol. A half ratefrequency divider is needed and configured as the input clockof this encoder so that only two bits are generated in onesymbol period. The LDPC encoder outputs 2 bits per symbolperiod that are used as the input of OQPSK modulator.

It can be found that there is only one group of XOR

operations required in this encoder and this group of XOR

is shared for 80 times. The information bits selector, alsoan FSM controller, determines which information bits areselected in the corresponding clock period. To verify howmany resources the encoder described in Algorithm 2 can save,hardware resources of encoders designed in the two algorithmsare compared. The resource utilization of the two encodersis summarized in Table I. The comparison indicates thatAlgorithm 2 helps to save hardware resources significantly.Although Algorithm 1 requires much more resources, it canbe adopted in the scenario that needs high throughout sincethe parallel use of (n − k) groups of XOR enables (n − k)parity check bits to be generated simultaneously.

V. RECEIVER MODEL DESIGN

This section will describe the receiver system model andhardware considerations of designing a receiving link. In thereceiver, an RF transceiver SoC (AD-FCOMMS3-EBZ RFtransceiver) is deployed as the front end to receive RF signalsand an FPGA SoC as the baseband processor to process thebaseband signals. The SDR structure of the receiver modelis shown in Fig. 5. As shown in Fig. 5, RF signals aredownconverted to baseband signals in the RF transceiverand then fed into the FPGA under the controllers’ control.The baseband processor is mainly consisted of seven com-ponents: AGC, matched filter, frequency compensation, tim-ing recovery, frame synchronization, soft demodulation, andLDPC decoding. Among these modules, soft demodulationand LDPC decoding blocks are implemented using HDLtoolbox. A general description of these blocks is given asfollows.

The AGC block is designed to regulate the received signalstrength and keep the receiver working within the normaloperating range aiming at overcoming the potential signalfading and sudden increase in signals’ power [30]. The AGCalgorithm boosts the gain if the received signal has weakamplitude, whereas the gain is attenuated to avoid signal


Fig. 4. Structure of the LDPC encoder in transmitter.

Fig. 5. SDR architecture of the receiver system model.

clipping and nonlinear degradation if the received signal hasstrong amplitude. With the AGC algorithm, the average peakamplitude is dynamically adjusted to a stable value, which isnecessary for the subsequent modules.

The frequency estimation comprises two stages: CFC andFFC. The CFC performs a rough order estimate on the fre-quency error between the expected frequency and the observedfrequency from the received signals. The FFC provides a moreprecise estimation of the remaining small frequency error.

The header of the frame is found in the frame syn-chronization block so that the payload symbols can belocated. Log-likelihood ratio (LLR) values are calculated ina soft-decision modulator. Once the decoded bits and validsignal are available, they are packed into a pair and sent out.After the decoded bits and valid signal are separated in thehost computer, the decoded bits are converted into readablemessage and printed on the screen.

Fig. 6. Improved scheme to detect signal’s magnitude.

A. Simplification of Magnitude Detection in AGC

AGC is typically designed with a closed-loop feedbackstructure that mainly includes a decision feedback equal-izer (DFE) and a magnitude detection block. In this AGCmodule, the loop gain and the reference value are set to0.1 and 0.5, respectively. The DFE generates a feedbackvalue according to the magnitude of the input signal, loopgain, and reference value. This feedback value is dynamicallycalculated and acts as an amplifier or attenuator. The detectionof signal’s magnitude is involved in the square and rootsquare calculations. Conventional methods to detect the signalinclude the adoption of IP core or CORDIC algorithms. In thispaper, an improved scheme to detect signal’s magnitude isproposed. The implementation structure is shown in Fig. 6.The magnitude detection block gives an approximate value ofthe amplitude by implementing the following operation [31]:

√I 2 + Q2 ≈ |L| + λ ∗ |S| (3)

where L = max(I, Q), S = min(I, Q), I and Q are the ampli-tudes of in-phase and quadrature components, respectively, andλ is a positive number between 0 and 1. The approximationsuggests that the calculation of a signal’s amplitude can berealized by the linear combination of amplitudes of in-phaseand quadrature components.

We ought to find an optimal λ value that can minimize theerror caused by the approximation. λ is enumerated with a stepsize of 1/27 to find its optimal value. For each λ, 10 000 I andQ samples are random values within the range [−10, 10]. Theactual amplitude (I 2 + Q2)1/2 and the approximate amplitude


Fig. 7. Find the optimal value of λ.

(|L|+λ∗|S|) of these samples are calculated. The mean squareerror (mse) of the approximation can also be obtained. Fig. 7shows the mse at difference λ, which indicates that the optimalλ value equals 0.3125.

B. Simplified LDPC Decoder

As described in Section IV-A, the H matrix of EG-LDPC(255, 175) is of size 255 × 255. A total of 255 CNPs,255 VNPs, and 8160 (16 × 255 × 2) register units areneeded in order to design a decoder with a fully parallelarchitecture. A decoder with 255 CNPs may bring 0.4-dB gainat the BER of 10−7 compared with the one with 80 CNPs;however, it requires two times more hardware resources. In thispaper, a simplified LDPC coder based on the matrix H� isdesigned to save hardware resources. The new matrix H� canbe represented by a Tanner graph [26]. There are 80 CNsand 255 VNs in the graph, which correspond with 80 CNPsand 255 VNPs. Each row of H� has a regular row degreeof 16; however, each column has an irregular column degree.Suppose that the j th column of H� has a degree of d j .Accordingly, each CN is connected to 16 VNs and the j thVN is connected to d j CNs.

The structure of the simplified LDPC decoder is shownin Fig. 8, which includes an FSM controller, LLR buffer,channel information LLR register, vtc registers, ctv registers,80 CNPs, 255 VNPs, and bits buffer. vtc registers store themessage that VNPs updated and ctv registers store the messagethat CNPs updated. Since the new matrix H� has 80 rowsand each row has a regular degree of 16, there are only1280 (16 × 80) units needed in the ctv and vtc registers,respectively. The decoding processes can be described as foursteps: buffer the LLR values, CNs and VNs alternativelyupdate, make hard decision, and send the decoded bits out.

First, an LLR start signal is activated when LLR valuesare received from the modulator. Once the LLR valid signal ishigh, every two LLR values are stored into the LLR buffers ineach symbol period. The FSM controller distributes each pairof LLR values to the corresponding buffer. It takes 128 symbolperiods to store a frame of LLR values. Once a whole frame

of LLR values are buffered, the values in the LLR buffer aremoved into the channel information LLR register. Then, CNPsand VNPs begin to work. Second, 80 CNPs simultaneouslyread data from the vtc registers and then update values andstore the updated values into the ctv registers, which take twosymbol periods. After CNPs’ operations, 255 VNPs work in asimilar way: read data from the ctv registers and then updatevalues and store the updated values into the vtc registers,which take two symbol periods as well. A total of foursymbol periods are needed to finish one iteration. Usually,more iterations may result in a better performance but alsolower the throughput. To make a compromise, CNPs and VNPsalternatively perform read-and-update operations for 20 times.Third, when VNPs finish the 20th iteration, hard decision ismade based on the summation of channel information andthe updated values from all the CNPs. Then, VNPs send thewhole frame of bits to the bits buffer and simultaneously theFSM controller generates a valid signal, which indicates thatdecoded bits are available. Finally, every two decoded bits aresent out from the bits buffer in each symbol period.

During the CNs’ update processes, there are 80 CNPsworking simultaneously. Each CNP reads 16 values from thevtc registers sequentially. Here, the corresponding 16 positionsare only decided by the positions of “1” in the row ofH� [32]. In hardware implementation, the main operation ofCNs’ update is to find the minimum and second minimumabsolute values among these 16 values. An efficient way tofind these two values is implementing a three-level comparatortree, as shown in Fig. 9 [33], [34]. There are seven same unitsduplicated, and each one is used to find the minimum andsubminimum values from the four inputs. The comparator treeis able to find these values within one clock cycle. When theminimum and the subminimum are found, both are scaled by0.75 [26].

Similarly, there are 255 VNPs working simultaneouslyduring the VNs’ update processes. The j th VNP only reads d j

values from the corresponding d j positions of the ctv registers.Here, the corresponding d j positions are only decided by thepositions of “1” in the j th column of H� [32]. The values to beupdated are simply the summation of the channel informationand the (d j − 1) values that are selected from d j values readby j th VNP [26]. More details of CNs’ and VNs’ updateprocesses can be found in [26].

The modified min-sum decoder is implemented with a fixedpoint data representation. The reduction of bit width maylead to reduction of area, power consumption, and hardwareresources. However, this may also bring degradation of thedecoding performance. It is necessary to find an efficient fixedpoint scheme that uses bit width as small as possible andkeeps an acceptable performance. The BER versus Eb/N0for different simulation schemes is shown in Fig. 10; 300-biterrors are collected in each Eb/N0. The dashed lines show thesimulation results in a different fixed point precision, and thesolid lines show the float precision performances for differentnumbers of iterations. The solid line with pentagram shows theperformance of the decoder that is designed with 255 CNPsand 255 VNPs. Let (i, f ) denote the fixed point simulationscheme with the total of (1 + i + f ) bits, where 1 bit is used


Fig. 8. Simplified LDPC decoder with a fully parallel structure.

Fig. 9. Structure of the comparator to find the minimum and second minimumvalues.

for sign bit, i bits for integer part, and f bits for fractional partof the LLR value. The simulation results show that this codeusing 7 bits (1 bit for sign, 4 bits for the integer part, and 2 bitsfor fractional part) or 6 bits (1 bit for sign, 4 bits for the integerpart, and 1 bits for fractional part) representation easily sufferfrom a higher error floor. Considering the hardware resourcesand decoding performance, (5, 1) is a good choice to representthis code. It is shown that decoding with 50 iterations onlyimproves the performance by 0.1 dB at a BER of 10−6 than theone with 10 iterations and almost has the same performancewith the one in 20 iterations. It is also demonstrated thatthe performance of this code under a modified decoder issignificantly improved at Eb/N0 higher than 4.5 dB comparedwith the rate 1/2 LDPC code from [9].

VI. IMPLEMENTATION AND RESULTS

To validate the proposed design, over-the-air transmissionsexperiments are conducted on the SDR platform. The test

Fig. 10. Simulation results of the modified min-sum decoder.

procedure can be described as follows. In the transmitter, bothpilot and information bits are generated in the host computer,and every 2 bits forms a symbol in the in-phase/quadratureform. These symbols in complex number form are thenoversampled to align with the clock rate on the FPGA andfed into the transmitter module on the FPGA. The output froman FPGA is fed to the AD9361 radio that generates the RFwaveform and transmits through the antenna. In the receiver,signals are received through the radio and get processed bythe receiver module on the FPGA, as described in Section V.Decoded bits are then sent to the host through the output databus. Finally, the host computer buffers the decoded bits andvalid bit and converts the bits to ASCII form for messagedisplay.

The frame structure is shown in Fig. 11. A 13-bit Barkercode is upsampled by 2 for the QPSK modulation withoutlosing peak autocorrelation characteristic. For easy demonstra-tion, the ASCII representation of “this is the test frame ##” is


Fig. 11. Frame structure with pilot and encoded bits.

Fig. 12. SDR platform used in this model.

chosen as the 175 information bits, where ## are 2 digits from00 to 99. The payload is the corresponding LDPC codewordof information data with an extra bit 0 appended. There are141 symbols (13 for the header and 128 for the payload) ineach frame.

A. Test Platform

This section describes how to integrate and run thecommunication system model. The Workflow Advisor inSimulink [29] is used to generate the HDL code for all blocksin the transmitter and receiver except those designed withthe black box tools, such as LDPC encoder and decoder.The modules designed with black block tools are replaced bythe customized HDL code. The Workflow Advisor integratesall the modules to build an entire transmitter and receivermodel and generates an FPGA bit file. By downloading theFPGA bit files into FPGAs, all the modules described inSections IV and V can be targeted to the SDR hardware. TheSDR platform is shown in Fig. 12, in which the left is theZC706 board as the receiver and the right is the ZedBoard asthe transmitter. Both FPGA boards hold one antenna and onedaughter board (AD-FCOMMS3-EBZ RF transceiver) withthe center frequency at 2.4 GHz (adjustable from 70 MHz to6.0 GHz) and baseband sample rate at 245.76 kHz (adjustablefrom 200 kHz to 56 MHz). The data rate of our experimentsis 122.88 Kb/s. Each FPGA board is connected with the hostcomputer through an Ethernet cable. The transmitter antennais separated from the receiver antenna by 2 m during the test.When the decoded message is obtained, it is printed on thescreen. Fig. 13 shows the received messages during real-time

Fig. 13. Readable message in real-time transmission.

TABLE II

FPGA HARDWARE RESOURCE UTILIZATION FOR THE

TRANSMITTER AND THE RECEIVER

transmission. The accumulative BER value is printed every50 frames although the actual Eb/N0 is not estimated.

B. Hardware Resources Utilization and Power Consumption

The details of hardware resource utilized in the transmitterand receiver are given in Table II. The table shows that thereceiver requires significantly more hardware than the trans-mitter; 71.8% of LUTs are used in the receiver. As analyzedin Section V-B, the decoder if implemented with the 255 ×255 H matrix would cost twice more hardware resources.This ZC706 FPGA platform may not be able to fit theentire receiver design if the LDPC decoder is not simplified.An FPGA power estimation shows that the transmitter andreceiver require a power consumption of 2.1 and 3.2 W, respec-tively. As mentioned in Section II, a small satellite usually hasa rechargeable battery in the capacity of about 192∼400 Wh.The total power consumption of the FPGA-based SDR is about5 W, which is within the power limitation of an ISC system.

C. Signal Quality in Real-Time Transmission

In digital communications, an error vector magnitude(EVM) value is a comprehensive measurement of the qualityof the receiver. The EVM value is calculated as in 4, whereI j and Q j are the in-phase and quadrature components of thej th receive signal, I∼

j and Q∼j are the in-phase and quadrature

components of the j th ideal reference signal, respectively, andN is the frame length. In this paper, the ideal reference signalsare the corresponding signals on the constellation points

EVM =

√√√√√1N

∑Nj=1

[(I j − I∼

j

)2 + (Q j − Q∼

j

)2]1N

∑Nj=1

[(I∼

j

)2 + (Q∼

j

)2] . (4)


TABLE III

EVM VALUES IN DIFFERENT BER LEVELS

Fig. 14. Performance comparisons with other typical codes.

When the transmission tests are conducted, the signals outof the frame synchronization module are collected and theBER values for the collected signals are obtained. Accordingto the collected signals and the reference signals, the EVMvalues can be calculated. From the theoretical BER versusEb/N0 curve, Eb/N0 can be estimated. Table III shows thecalculated EVM values and the estimated Eb/N0 values in dif-ferent BER levels. It is shown that the calculated EVM valuesmatch the theoretical ones [35]. A smaller EVM indicates thatthe collected signals are closer to the ideal signals.

D. Comparison With Other Works

Reed–Solomon (RS) codes are widely used in consultativecommittee for space data system’s (CCSDS’s) space commu-nications protocol specifications. Based on NASA’s researchon RS codes, the CCSDS has specified RS (255, 223) asthe FEC standard for the telemetry channel coding and RS(255, 239) as the forward and reverse link for an advancedorbiting system. However, the decoding process of these twocodes converges very slowly. A comparison with these twocodes and the Harris’s convolution code is made in Fig. 14.The RS codes suffer from very poor performance at Eb/N0lower than 6 dB. Compared with Harris’ half rate convolutioncode [8], our proposed LDPC decoder in 20 iterations hasa 1.6-dB performance gain at the BER of 10−6. AlthoughHarris’ convolution code shows a better performance than theRS codes at Eb/N0 lower than 6 dB, the convergence is stillslower than this LPDC code. The comparison reveals thatthe simplified LPDC decoder still keeps the property of fastconvergence and good performance.

TABLE IV

COMPARISON WITH OTHER WORKS

A comparison with other works on other aspects is sum-marized in Table IV. The fifth column lists Eb/N0 requiredat the BER of 1.0 × 10−6. Most of works prefer to adoptthe FPGA-based SDR platform to validate the related work.The comparison reveals that our proposed scheme has thehighest code rate and requires the least Eb/N0 at the BERof 1.0 × 10−6. Although the SDR in [11] needs the leastpower consumption, their scheme had the minimal code rateand shown worse performance.

VII. CONCLUSION

The optimized transmitter and receiver models for ISCsare presented and implemented on the FPGA-based SDRplatform. The proposed SDR is full-featured and suitable forthe power-limited small satellites system. The AGC, frequencycompensation, and decoder modules are optimized in order toreduce hardware resource utilization. A simplified decoder forthe EG-LDPC (255, 175) code is designed based on the modi-fied MSA. The simplification reduces the hardware utilizationby two thirds with only little performance degradation. Fur-thermore, the transmitter and the receiver are targeted on theFPGA-based SDR platform with a successful demonstrationof over-the-air transmission in the S-band.

REFERENCES

[1] C. Frost et al., “Small spacecraft technology state of the art,” NASAAmes Res. Center, Mountain View, CA, USA, NASA Tech. Rep. TP-2014-216648/REV1, 2014.

[2] R. Sun, D. Maessen, J. Guo, and E. Gill, “Enabling inter-satellitecommunication and ranging for small satellites,” in Proc. 9th Symp.Small Satellites Syst. Services, Funchal, Portugal, May 2010, pp. 1–15.

[3] T. Vladimirova, C. P. Bridges, J. R. Paul, S. A. Malik, andM. N. Sweeting, “Space-based wireless sensor networks: Design issues,”in Proc. IEEE Aerosp. Conf., Mar. 2010, pp. 1–14.

[4] T. Vladimirova, X. Wu, and C. P. Bridges, “Development of a satellitesensor network for future space missions,” in Proc. IEEE Aerosp. Conf.,Big Sky, MT, USA, Mar. 2008, pp. 1–10.

[5] B. Edwards, “Spectrum requirements and allocation survey report andrecommendations,” NASA Goddard Space Flight Center, Greenbelt,MD, USA, Tech. Rep. 1, 2002.

[6] R. Radhakrishnan, W. W. Edmonson, F. Afghah,R. M. Rodriguez-Osorio, F. Pinto, and S. C. Burleigh, “Survey ofinter-satellite communication for small satellite systems: Physical layerto network layer view,” IEEE Commun. Surveys Tuts., vol. 18, no. 4,pp. 2442–2473, 4th Quart., 2016.

[7] M. H. Novak and J. L. Volakis, “Ultrawideband antennas for multibandsatellite communications at UHF–Ku frequencies,” IEEE Trans. Anten-nas Propag., vol. 63, no. 4, pp. 1334–1341, Apr. 2015.

[8] J. A. Downey, R. C. Reinhart, and T. J. Kacpura, “Pre-flight testing andperformance of a Ka-band software defined radio,” in Proc. 30th AIAAInt. Commun. Satell. Syst. Conf. (ICSSC), Sep. 2012, pp. 1–13.


[9] G. Wu, Y. Bai, and Z. Sun, “Research on formation of microsatellitecommunication with genetic algorithm,” Sci. World J., vol. 2013, no. 4,pp. 1–7, Sep. 2013.

[10] W. Edmonson et al., “Systems engineering of inter-satellite communi-cations for distributed systems of small satellites,” in Proc. 9th Annu.IEEE Int. Syst. Conf. (SysCon), Apr. 2015, pp. 705–710.

[11] M. R. Maheshwarappa, M. D. J. Bowyer, and C. P. Bridges, “Improve-ments in CPU & FPGA performance for small satellite SDR applica-tions,” IEEE Trans. Aerosp. Electron. Syst., vol. 53, no. 1, pp. 310–322,Feb. 2017.

[12] D. Zhou, M. Sheng, X. Wang, C. Xu, R. Liu, and J. Li, “Mission awarecontact plan design in resource-limited small satellite networks,” IEEETrans. Commun., vol. 65, no. 6, pp. 2451–2466, Mar. 2017.

[13] L. Goeller and D. Tate, “A technical review of software defined radios:Vision, reality, and current status,” in Proc. IEEE Mil. Commun. Conf.(MILCOM), Oct. 2014, pp. 1466–1470.

[14] R. W. Stewart et al., “A low-cost desktop software defined radio designenvironment using MATLAB, simulink, and the RTL-SDR,” IEEECommun. Mag., vol. 53, no. 9, pp. 64–71, Sep. 2015.

[15] Vulcan Wireless SDR Products. [Online]. Available: http://www.vulcanwireless.com/products/mbt

[16] Spectrum Software Defined Radio. [Online]. Available: http://www.militaryaerospace.com/articles/print.html

[17] M. R. Maheshwarappa, M. Bowyer, and C. P. Bridges, “Software definedradio (SDR) architecture to support multi-satellite communications,” inProc. IEEE Aerosp. Conf., Mar. 2015, pp. 1–10.

[18] J. R. Wertz and W. J. Larson, Space Mission Analysis and Design.New York, NY, USA: Springer-Verlag, 1999.

[19] A. Budianu, T. J. W. Castro, A. Meijerink, and M. J. Bentum, “Inter-satellite links for cubesats,” in Proc. IEEE Conf. Aerosp., Big Sky, MT,USA, Mar. 2013, pp. 1–10.

[20] S. Lin and D. J. Costello, Error Control Coding. Upper Saddle River,NJ, USA: Prentice-Hall, 2004.

[21] M. Qiu, L. Yang, Y. Xie, and J. Yuan, “On the design of multi-dimensional irregular repeat-accumulate lattice codes,” IEEE Trans.Commun., vol. 66, no. 2, pp. 478–492, Feb. 2018.

[22] R. Sun and J. Liu, “Distributed SR-LDPC codes over multiple-accessrelay channel,” in Proc. Int. Conf. Intell. Netw. Collaborative Syst.,Sep. 2013, pp. 626–628.

[23] D. Kim and J. Ha, “A low-complexity decoding algorithm for concate-nated tree codes,” in Proc. Int. Conf. Inf. Commun. Technol. Converg.,Oct. 2015, pp. 488–490.

[24] U. Mengali, Synchronization Techniques for Digital Receivers.New York, NY, USA: Springer-Verlag, 2013.

[25] M. S. Amin, M. B. I. Reaz, J. Jalil, and L. F. Rahman, “Digital modulatorand demodulator IC for RFID tag employing DSSS and barker code,”J. Appl. Res. Technol., vol. 10, no. 6, pp. 819–825, 2012.

[26] W. Ryan and S. Lin, Channel Codes: Classical and Modern. Cambridge,U.K.: Cambridge Univ. Press, 2009.

[27] M. Luise and R. Reggiannini, “Carrier frequency recovery in all-digitalmodems for burst-mode transmissions,” IEEE Trans. Commun., vol. 43,nos. 2–4, pp. 1169–1178, Feb. 1995.

[28] Generate Black Box Interface for Subsystem. [Online]. Available: https://www.mathworks.com/help/hdlcoder/ug/blackbox-implementation-for-subsystem-blocks.html

[29] Basic HDL Code Generation With the Workflow Advisor. [Online]. Avail-able: https://www.mathworks.com/help/hdlcoder/examples/basichdl-code-generation-with-the-workflow-advisor.html

[30] C. R. Johnson, Jr., W. A. Sethares, and A. G. Klein, Software ReceiverDesign: Build Your Own Digital Communication System in Five EasySteps. Cambridge, U.K.: Cambridge Univ. Press, 2011.

[31] M. Frerking, Digital Signal Processing in Communications Systems.New York, NY, USA: Springer-Verlag, 2013.

[32] K. Vasudevan, “Iterative detection of turbo-coded offset QPSK in thepresence of frequency and clock offsets and AWGN,” Signal, ImageVideo Process., vol. 6, no. 4, pp. 557–567, Nov. 2012.

[33] K. Zhang, X. Huang, and Z. Wang, “A high-throughput LDPC decoderarchitecture with rate compatibility,” IEEE Trans. Circuits Syst. I, Reg.Papers, vol. 58, no. 4, pp. 839–847, Apr. 2011.

[34] K. Zhang, X. Huang, and Z. Wang, “High-throughput layered decoderimplementation for quasi-cyclic LDPC codes,” IEEE J. Sel. AreasCommun., vol. 27, no. 6, pp. 985–994, Aug. 2009.

[35] H. A. Mahmoud and H. Arslan, “Error vector magnitude to SNRconversion for nondata-aided receivers,” IEEE Trans. Wireless Commun.,vol. 8, no. 5, pp. 2694–2704, May 2009.

Xin Cai is currently working toward the Ph.D.degree at Xidian University, Xi’an, China.

He was a Visiting Student with the Worcester Poly-technic Institute, Worcester, MA, USA, from 2015 to2017. His current research interests include infor-mation theory, channel coding, and software-definedradio.

Mingda Zhou received the B.S degree from Xi’anJiaotong University, Xi’an, China, in 2013, and theM.S. degree from the Worcester Polytechnic Insti-tute, Worcester, MA, USA, in 2015, where he iscurrently working toward the Ph.D. degree.

His current research interests include multiple-input multiple-output systems, synchronization, andfield-programmable gate array implementation.

Tian Xia (M’01–SM’06) received the B.S. degreein electronics and information engineering from theHuazhong University of Science and Technology,Wuhan, China, the M.S. degree in electrical engi-neering from the Nanjing University of Posts andTelecommunications, Nanjing, China, and the Ph.D.degree in electrical and computer engineering fromThe University of Rhode Island, Kingston, RI, USA.

He is currently a Full Professor at the Departmentof Electrical and Biomedical Engineering, Univer-sity of Vermont, Burlington, VT, USA. His current

research interests include mixed signal VLSI circuit design and test, andmicrowave and radio-frequency sensing circuit and system.

Wai H. Fong received the B.S. and M.S. degreesin electrical engineering from George WashingtonUniversity, Washington, DC, USA, in 1984 and1992, respectively, and the Ph.D. degree in electricalengineering from George Mason University, Fairfax,VA, USA, in 2015.

He is currently with the Goddard Space FlightCenter, NASA, Greenbelt, MD, USA. His expertiseis in the areas of digital communications specializingin error correction coding and estimation/detectiontheory. His current research interests include water-

color and oil painting, woodworking, and cosmology.

Wing-Tsz Lee received the B.S. degree in electri-cal engineering from the University of Maryland,College Park, MD, USA, in 2005, and the M.S.degree in electrical engineering from George MasonUniversity, Fairfax, VA, USA, in 2015.

She is currently a Communications Engineer atthe Goddard Space Flight Center, NASA, Greenbelt,MD, USA. Her current research interests include thearea of digital communications.

Xinming Huang (M’01–SM’09) received the Ph.D.degree in electrical engineering from Virginia Tech,Blacksburg, VA, USA, in 2001.

He was a Member of the Technical Staff atthe Wireless Advanced Technology Laboratory, BellLabs of Lucent Technologies, Whippany, NJ, USA,from 2001 to 2003. He is currently a Professor atthe Department of Electrical and Computer Engi-neering, Worcester Polytechnic Institute, Worcester,MA, USA. His current research interests include theareas of circuits and systems, with an emphasis on

wireless communications, information security, computer vision, and machinelearning.

Documents

Low-Power SDR Design on an FPGA for Intersatellite ...users.wpi.edu/~xhuang/pubs/2018_cai_tvlsi.pdfpaper presents a complete software-deﬁned radio (SDR) model for intersatellite