Sub-Nyquist Sampling DSP & SCD Modules Presented by: Omer Kiselov, Daniel Primor Supervised by:...

Sub-Nyquist Sampling DSP & SCD Modules

Presented by: Omer Kiselov, Daniel Primor
Supervised by: Ina Rivkin, Moshe Mishali

Winter 2010
High Speed Digital Systems Lab, Electrical Engineering Faculty

Technion – Israel Institute of Technology

Outline

• Overview – Goals and discussion
• Algorithm review
• Implementation in hardware
• Changes for Adaptation to hardware
• Evaluation
• Possible Optimization & Future Work

Overview

• The Goal system
• The module’s Objectives
• Interface

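[Figure: goal system block diagram. Blocks: Analog Back-end (Realtime), Expand 1:q, Memory, CTF (Support recovery), Detector, Delay FIFO, Support & Matrix, DSP (Baseband).]

$Y = AZ$, where $Z_i(f)$ is a slice of $X(f)$ shifted by a multiple of $f_p$, for $1 \le i \le L$.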

DSP & Support Change Detector – interface signals

• A matrix vector – 432 bits
• Support analysis vector – 101 bits
• First beta (for QR decomposition) – 36 bits
• Samples bundle – 432 bits
• Support changed – 1 bit
• Valid supports – 1 bit
• A matrix address – 9 bits
• Valid samples – 1 bit


Algorithm Review

• Pseudo-Inverse
– Matrix Decomposition
– Matrix Inversion
– Matrix Multiplication

• Support Change Detection
– Support threshold evaluation attempt

[Figure: module blocks. Pseudo inverse, Real Time Vector Multiplier, Support Change Detector.]

Algorithm Review – Pseudo Inverse

• Matrix Decomposition
• QR Decomposition
• Using Householder Reflections

$$A^{\dagger}_{n \times m} = \left(A^{T} A\right)^{-1}_{n \times n} A^{T}_{n \times m}$$

$$Q = Q_1 \cdots Q_i \cdots Q_k$$
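As an illustration of QR decomposition by Householder reflections, here is a minimal floating-point sketch in plain Python. It is not the module's fixed-point VHDL. The function name `householder_qr` is hypothetical, and `beta` is the usual Householder scalar 2 / (v^T v); whether the module's "beta" is defined the same way is an assumption.

```python
import math

def householder_qr(a):
    """Return (q, r) with a = q * r, q orthogonal (m x m), r upper triangular (m x n)."""
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for k in range(min(m - 1, n)):
        # Column k below (and including) the diagonal.
        x = [r[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        if norm == 0.0:
            continue  # nothing to eliminate in this column
        # Reflection vector v = x + sign(x0) * ||x|| * e1 (avoids cancellation).
        v = x[:]
        v[0] += math.copysign(norm, x[0])
        beta = 2.0 / sum(t * t for t in v)  # assumed definition of beta
        # r <- (I - beta v v^T) r, applied to rows k..m-1.
        for j in range(n):
            s = beta * sum(v[i] * r[k + i][j] for i in range(len(v)))
            for i in range(len(v)):
                r[k + i][j] -= s * v[i]
        # q <- q (I - beta v v^T), applied to columns k..m-1.
        for i in range(m):
            s = beta * sum(q[i][k + j] * v[j] for j in range(len(v)))
            for j in range(len(v)):
                q[i][k + j] -= s * v[j]
    return q, r

a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
q, r = householder_qr(a)
```

Each pass builds one reflection and applies it to both `r` and `q`, so `q` accumulates the product of the reflections while `r` becomes upper triangular.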

Algorithm Review – Pseudo Inverse

• Matrix Inversion – Gaussian Elimination

• Matrix Multiplication

[Figure: the Matrix Multiplier and the Vector Multiplier share the Matrix Multiplier's Common Interface.]
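For an upper triangular matrix, Gaussian elimination reduces to back substitution. The sketch below inverts such a matrix column by column and includes a plain matrix multiplier for checking the result. It is a floating-point illustration with hypothetical function names, not the hardware units themselves.

```python
def invert_upper_triangular(r):
    """Invert an n x n upper triangular matrix with a nonzero diagonal."""
    n = len(r)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        # Solve R x = e_col by back substitution; x is column `col` of R^-1.
        # Only rows 0..col can be nonzero, because R^-1 is upper triangular too.
        for i in range(col, -1, -1):
            rhs = 1.0 if i == col else 0.0
            s = rhs - sum(r[i][j] * inv[j][col] for j in range(i + 1, col + 1))
            inv[i][col] = s / r[i][i]
    return inv

def matmul(a, b):
    """Plain matrix product of a (p x q) and b (q x s)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

r = [[2.0, 1.0, 3.0],
     [0.0, 4.0, 5.0],
     [0.0, 0.0, 8.0]]
r_inv = invert_upper_triangular(r)
identity = matmul(r, r_inv)  # should be close to the 3 x 3 identity
```

Every column needs one division per row, which fits the later remark that the divider is one of the heavy arithmetic units.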

Algorithm Review - SCD

• The support change detector is a vector multiplier: it takes one row of the pseudo-inverse of the A matrix and multiplies it by the signal to check whether the energy there is anything other than noise.

• Threshold generation attempt:

– If there was no support change

– If we replace W with the average:

– The generated value doesn't show any false alarms, but it may miss detections in several cases where the SNR is low.

*Eventually, the threshold was defined as a user input.

[Equations: threshold derivation. The threshold $T$ is built from the minimal sample amplitude in range and the noise amplitude $A_{noise}$, which is tied to the signal amplitude through the SNR. The power of a 24-sample frame, $P_{samp} = \sum_{i=1}^{24} (y_i W)^2$, is bounded with $W_{avg}$ in place of $W$ and compared against the noise power $P_{noise}$ scaled by $T$.]

Our estimated threshold for the AM demo is 000001000110010100 (about 0.3).
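The slides do not state the fixed-point format of the 18-bit threshold word. As a sketch, assuming an unsigned word with 14 fraction bits (an assumption chosen only because it lands close to the quoted value of about 0.3), the word can be decoded like this:

```python
def decode_fixed_point(bits, frac_bits):
    """Interpret a binary string as an unsigned fixed-point number."""
    return int(bits, 2) / (1 << frac_bits)

threshold_word = "000001000110010100"   # 18-bit word from the slide
FRAC_BITS = 14                          # assumed, not given in the slides

raw = int(threshold_word, 2)            # integer value of the word
threshold = decode_fixed_point(threshold_word, FRAC_BITS)
print(raw, threshold)
```

With other fraction widths the same word decodes to a different value, so the format has to be taken from the actual design before reusing the number.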

DSP & SCD system operation

[Figure: system operation block diagram. Pseudo-inverse path: A Matrix RAM and support indexes (A_s), QR Decomposition (reflections creation, reflection multiplication, auxiliary multiplications, Q'), upper triangular matrix inverse (R inversed), matrix multiplier, A dagger stored in a Ping-Pong Buffer (RAM). Real-time path: samples from Expand, Delay FIFO, Real Time Matrix-Samples Multiplier, reconstructed signal. Support Change Detector with its control vector.]


Implementation In Hardware

[Figure: pseudo-inverse chain. QR decomposition, then inversion of an upper triangular matrix, then the matrix multiplier.]

Block (Entities) Definition – Pseudo Inverse

• QR Decomposition
– Phase 1 and Phase 2
– 24 multipliers
– Beta calculation unit

• Matrix Inversion Unit
– Vector inversion unit (vector inverter)
– FIFO for the original R matrix

• Matrix Multiplier

• Real Time Multiplier

Outline - Adaptation to Hardware

• Overview – Goals and discussion
• Algorithm review
• Implementation in hardware
• Adaptation to hardware
– Complex Enhance
– Normalizing the Input
– Resolution (Overflow) discussion
– SCD – running average
– Timing issues
• Evaluation
• Possible Optimization & Future Work

Complex Enhance

• To avoid all complex multiplications, we changed the structure of the matrix: every complex entry is replaced by a 2×2 real block.

• The matrix becomes 4 times bigger, but every complex vector multiplication can still be done as an ordinary real vector-by-vector multiplication and gives the correct result.

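$$a_{ij} \in A \;\longmapsto\; \begin{pmatrix} \mathrm{real}(a_{ij}) & -\mathrm{imag}(a_{ij}) \\ \mathrm{imag}(a_{ij}) & \mathrm{real}(a_{ij}) \end{pmatrix}$$

where $i$ is the row number and $j$ is the column number.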
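The real-valued form can be checked with a small sketch. It assumes the standard embedding in which each complex entry becomes the block [[real, -imag], [imag, real]] and each complex vector element becomes the pair (real, imag). The helper names are hypothetical.

```python
def real_embedding(a):
    """Map an m x n complex matrix to a 2m x 2n real matrix."""
    out = []
    for row in a:
        top, bottom = [], []
        for z in row:
            top += [z.real, -z.imag]     # produces the real part of z * x
            bottom += [z.imag, z.real]   # produces the imaginary part of z * x
        out += [top, bottom]
    return out

def real_vector(v):
    """Interleave real and imaginary parts of a complex vector."""
    out = []
    for z in v:
        out += [z.real, z.imag]
    return out

def matvec(m, v):
    """Ordinary real matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

a = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
v = [2 - 1j, 1 + 1j]
real_result = matvec(real_embedding(a), real_vector(v))
complex_result = [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]
```

`real_result` holds the real and imaginary parts of `complex_result` interleaved, computed with real multiplications and additions only.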

Normalizing the Input

• Accuracy falls as the mantissa gets smaller.

• Matrices can be normalized before the inversion and de-normalized after it.

• Hence, for an invertible diagonal matrix $D$ and $A$ with full column rank: $(AD)^{\dagger} = D^{-1} A^{\dagger}$, so $z = A^{\dagger} y = D\,(AD)^{\dagger} y$.

• Motivation
– The real data differed from the synthetic data we were given, so 18 bits are not enough (we need to represent both a number and 1 divided by that number).
– Normalizing the matrix lets us choose where the fraction point sits, to minimize error and underflow.
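One way to see why normalizing before the inversion and undoing it afterwards is harmless: for a diagonal, invertible D and a full-column-rank A, (AD)† = D⁻¹ A†. The sketch below checks this numerically for a two-column matrix. The scaling values and function names are illustrative assumptions, and the pseudo-inverse here uses the normal equations rather than the QR path of the module.

```python
def pinv_two_columns(a):
    """Pseudo-inverse of an m x 2 full-column-rank matrix: (A^T A)^-1 A^T."""
    g00 = sum(row[0] * row[0] for row in a)
    g01 = sum(row[0] * row[1] for row in a)
    g11 = sum(row[1] * row[1] for row in a)
    det = g00 * g11 - g01 * g01
    return [[(g11 * row[0] - g01 * row[1]) / det for row in a],
            [(-g01 * row[0] + g00 * row[1]) / det for row in a]]

def scale_columns(a, d):
    """Return A * diag(d): column j is multiplied by d[j]."""
    return [[row[0] * d[0], row[1] * d[1]] for row in a]

# Badly scaled columns, similar in spirit to real data with a wide range.
a = [[0.001, 3.0], [0.002, -1.0], [0.004, 2.0]]
d = [512.0, 0.25]  # powers of two, i.e. plain shifts in fixed point (assumed)

direct = pinv_two_columns(a)
scaled = pinv_two_columns(scale_columns(a, d))
# Undo the normalization: A† = D (AD)†, so row i is multiplied by d[i].
restored = [[d[i] * value for value in scaled[i]] for i in range(2)]
```

`restored` matches `direct` up to rounding, while the inversion itself ran on columns of comparable magnitude.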

Support Change Detection – with running average

[Figure: SCD with running average. Blocks: Samples, Control vector RAM, Vector multiplier, accumulator (+), Cycle counter, Threshold, Detection.]
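A minimal software sketch of a detector of this kind is shown below. It assumes real-valued samples, a fixed window length, and detection when the running average of the squared projection exceeds the threshold. These details, and the class name, are assumptions for illustration, not the module's exact behavior.

```python
from collections import deque

class SupportChangeDetector:
    """Running-average energy detector (illustrative, floating point)."""

    def __init__(self, control_vector, threshold, window):
        self.control_vector = list(control_vector)  # one row of the pseudo-inverse
        self.threshold = threshold                  # user-supplied threshold
        self.energies = deque(maxlen=window)        # last `window` energies

    def push(self, samples):
        """Feed one samples bundle; return True if a change is detected."""
        # Vector multiplier: project the bundle on the control vector.
        projection = sum(c * s for c, s in zip(self.control_vector, samples))
        self.energies.append(projection * projection)
        # Running average over the bundles currently in the window.
        average = sum(self.energies) / len(self.energies)
        return average > self.threshold

detector = SupportChangeDetector([1.0, 0.0], threshold=0.3, window=4)
quiet = detector.push([0.1, 5.0])   # little energy along the control vector
loud = detector.push([2.0, 0.0])    # strong energy along the control vector
```

Averaging over a window trades detection delay for fewer false alarms from single noisy bundles.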

Timing

• Deep pipeline
– We deepened the pipeline so that the module works at the desired high frequency. Quartus currently reports that the module meets only the given frequency. It can be raised by adding pipeline stages at the bottlenecks found in the design.

• Clocks
– Main clock: 20 MHz, may rise to 70 MHz
– Working clock for the pseudo inverse: 100 MHz, currently not flexible

• Hardware reuse
– The matrix multiplier and the inverse unit each reuse a single vector-sized unit over many iterations, which makes them the bottlenecks.

Bottlenecks in the design

• Matrix Inverse
• Matrix Multiplier
• Beta calculation in the QR: heavy arithmetic operations take place.

• If we replace the arithmetic units within these entities with more deeply pipelined units (the division takes 23 cycles, the square root 11 cycles and the multiplier 2), the maximal frequency will rise.

• There is no real reason to run at a higher clock unless the on-chip memory is too small for the delay FIFO or speed becomes an actual necessity.

Resource Consumption

• Total numbers taken from Stratix III FPGA EP3SE260F1152C2

Resource | Alone | With architecture | Total on FPGA | Usage alone | Usage with architecture | Architecture consumption | Architecture share of usage
combinational ALUTs | 51,940 | 62,913 | 203,520 | 25.52% | 30.91% | 5.39% | 17.44%
memory ALUTs | 0 | 640 | 101,760 | 0.00% | 0.63% | 0.63% | 100.00%
logic registers | 17,788 | 48,820 | 203,520 | 8.74% | 23.99% | 15.25% | 63.56%
memory bits | 100,224 | 1,240,808 | 15,040,512 | 0.67% | 8.25% | 7.58% | 91.92%
DSP block 18-bit elements | 752 | 752 | 768 | 97.92% | 97.92% | 0.00% | 0.00%
PLLs | 0 | 5 | 8 | 0.00% | 62.50% | 62.50% | 100.00%
DLLs | 0 | 2 | 4 | 0.00% | 50.00% | 50.00% | 100.00%

DSP – Runtime Analysis

• The worst-case pseudo-inverse timing (for 11 support vectors) is a delay of 0.5 milliseconds. Hence an appropriately sized delay FIFO is required.

• The SCD and the reconstruction multiplier work in real time (one 50 ns cycle).
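As a rough sizing sketch for the delay FIFO: a 50 ns cycle corresponds to the 20 MHz main clock, so a 0.5 ms worst-case delay spans 10,000 cycles. The calculation below additionally assumes that one 432-bit samples bundle is written per cycle, which the slides do not state. The real FIFO may be narrower or fed at a lower rate.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)

latency_ns = 500_000   # worst-case pseudo-inverse delay: 0.5 ms
cycle_ns = 50          # one main-clock cycle at 20 MHz
bundle_bits = 432      # samples bundle width from the interface list

depth = ceil_div(latency_ns, cycle_ns)   # number of bundles to hold
fifo_bits = depth * bundle_bits          # memory under the stated assumption
print(depth, fifo_bits)
```

Working in integer nanoseconds keeps the depth exact. Under these assumptions the FIFO would need a few megabits, which is why on-chip memory is a consideration when choosing the clock.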

Outline

• Overview – Goals and discussion
• Algorithm review
• Implementation in hardware
• Changes for Adaptation to hardware
• Evaluation
– Testing method
– Results
– Discussion
– Conclusions
• Possible Optimization & Future Work

Evaluation - Testing

[Figure: Logical Testing. Input text files (expanded samples, CTF output support) drive the VHDL test bench, which contains the A matrix memory, the status parser and the functional module (DSP & SCD). The output text files are compared with the fixed-point Matlab model.]

Evaluation - Testing

[Figure: On Chip Testing. Input text files (expanded samples, CTF output support) drive the debug environment, which contains the A matrix RAM, the CTF model & FIFO control and the functional module (DSP & SCD). The output text files go to analysis and comparison to ModelSim.]

Evaluation - Results

• Results of the run on FPGA with the following signals
– Fm259_252_sin824_809
– Fm259_252_am872.697
– Am_872.697_sin824

• SCD test

[Figures: spectra of the reconstructed sequences #1 and #2 (x-axis: frequency in MHz, 0 to 90), comparing the Matlab simulation, the fixed-point ModelSim simulation and the FPGA output (fixed-point hardware).]

[Figure: support change experiment, with the point where the support changed marked.]

Evaluation - Discussion

• Correctness was inspected in comparison to Matlab using the following:
– Maximal MSE of the calculated pseudo-inverse matrix values
– Maximal and averaged values of the difference between the results of the Matlab simulation and the actual results
– Visual inspection of the differences

• The SCD experiment was composed of two sample bundles with different supports put together, to inspect correctness and to learn more about the support threshold.
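The comparison metrics above can be sketched as follows. The function name is hypothetical, and taking the maximum and the mean of the squared differences is one reasonable reading of "maximal" and "averaged", not necessarily the exact definition used in the project.

```python
def error_metrics(reference, measured):
    """Return (maximal squared error, mean squared error) of two sequences."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - m) ** 2 for r, m in zip(reference, measured)]
    return max(squared), sum(squared) / len(squared)

# Toy values only; the real inputs would be the Matlab and FPGA outputs.
matlab_out = [1.0, 2.0, 3.0]
fpga_out = [1.0, 2.5, 3.0]
max_sq, mean_sq = error_metrics(matlab_out, fpga_out)
```

Reporting both numbers separates a single large outlier from a generally small error level.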

Evaluation – conclusions

• The MSE inspected for the inverted matrix is 10^-3.

• The MSE for the reconstructed signal:
– Maximal: 0.04
– Averaged: ~10^-6

• No actual conclusions were reached about the support change function: its behavior is predictable only when the support changes.


Future Work

• Possible optimizations
– Modifying the inversion algorithm for higher parallelism.
– Scaling the hardware to increase performance.

• Possibly changing the resolution of the calculations to 22 bits or more for better accuracy, at a great cost in hardware.

• Integration

Summary

• We managed to run the DSP and SCD module on the FPGA and got sufficient results.

• We introduced an algorithm for calculating the support threshold.

• We changed most of the architecture to support pipelining and to use minimal hardware (vector resolution).

• We changed the debug environment to support a different FPGA.
