+ CS 325: CS Hardware and Software Organization and Architecture Memory Organization


+Storage/Memory Hierarchy

+Memory Storage Characteristics

Location

Capacity

Unit of transfer

Access method

Performance

Physical type

Physical characteristics

Organization

+Memory Storage - Location

CPU: Registers; L1, L2, L3, L4 Cache

Internal: Main Memory (System RAM); BIOS (EEPROM)

External: Magnetic Disk (HDD); Non-Volatile Solid State (SSD); Optical; Magnetic Tape

+Memory Storage - Capacity

Word size

The natural unit of organization

Expected size of most data and instructions

Typically 32 bits or 64 bits

Past: 16 bits

Typical Storage

L1 Cache: 32 – 64 KB per core

L2 Cache: 128 – 512 KB per core

L3 Cache: 2 – 8 MB (shared)

L4 Cache: 0 – 128 MB (video memory)

Main Memory (RAM): 4 – 32 GB (Typical Desktop)

HDD Cache: 16 – 64 MB

SSD: 64 – 512 GB

HDD: 200 – 2000 GB (Inexpensive, but extremely slow)

Optical:

DVD: 4.7 – 17.08 GB

Blu-ray: 25 – 100 GB

Magnetic Tape: 10 – 35 TB per cartridge (uncompressed)

+Performance – Transfer Rate Example Problem

Assume we have a 32-Mbit SDRAM memory with 8 bits simultaneously read and a cycle time of 250 ns.

How fast can data be moved out of memory?

8 b * (1 / 250 ns)

= 8 b * (4 x 10^6 / s)

= 32 Mbps

= 4 MBps
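The arithmetic above can be checked with a short Python sketch. The function name `transfer_rate_bps` and its parameters are illustrative choices, not part of the original slides; the only inputs are the 8 bits per cycle and the 250 ns cycle time stated in the problem.

```python
def transfer_rate_bps(bits_per_cycle: int, cycle_time_ns: float) -> float:
    """Bits moved per second when bits_per_cycle bits are read every cycle_time_ns."""
    cycles_per_second = 1_000_000_000 / cycle_time_ns  # 1 s = 10^9 ns
    return bits_per_cycle * cycles_per_second


rate = transfer_rate_bps(8, 250)
print(rate / 1e6, "Mbps")      # 32.0 Mbps
print(rate / 8 / 1e6, "MBps")  # 4.0 MBps (8 bits per byte)
```

Note that the 32-Mbit capacity does not enter the calculation; only the width of each read and the cycle time matter.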

+Memory Storage – Physical Types

Semiconductor: Cache, Main Memory (RAM), SSD

Magnetic: HDD, Tape

Optical: CD, DVD, Blu-Ray

Others: Bubble, Hologram

+Memory Storage – Physical Characteristics

Volatility

Erasable

Power consumption/Heat

+Memory Storage – Hierarchy List

Registers

L1 Cache

L2 Cache

L3 Cache

Main Memory

Disk Cache

SSD

HDD

Optical

Tape


Memory Basics

+Semiconductor Memory

Random Access Memory (RAM)

All semiconductor memory is random access

Directly accessed by address logic

Read/Write

Volatile: requires constant power supply

Temporary storage

Static: holds data without refreshing while powered

Dynamic: periodically refreshes charge

+Static RAM

Bits stored as on/off switches (transistors)

No charges to leak

Does not need refresh circuits

No refreshing needed when powered

Larger per bit

More expensive

Faster

Example: Cache memory

+Dynamic RAM

Bits stored as charge in capacitors (also uses transistors)

Charges leak from capacitors

Needs refreshing, even when powered

Needs refresh circuits

Smaller per bit

Less expensive

Slower

Asynchronous and Synchronous DRAMs

Example: Main memory

+Read Only Memory (ROM)

Permanent storage

Microprogramming

Library subroutines

Systems programs

Function tables

+Measures of Memory Technology

Density

Latency and cycle time

+Memory Density

Refers to memory cells per unit area of silicon

Usually stated as number of bits on a standard chip size

Examples: 1 Mb chip, 4 Mb chip

Memory cells typically structured in arrays: 1 Mb x 1 chip, 256 Kb x 4 chips

Note: a higher density chip generates more heat
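As a small illustration of array organization, the sketch below computes total capacity as (number of addressable words) x (bits per word). It assumes binary prefixes (1 Mb = 2^20 bits, 1 Kb = 2^10 bits) and reads "256 Kb x 4" as 256K words of 4 bits each; both are assumptions, and `chip_capacity_bits` is a hypothetical helper name.

```python
KBIT = 1024          # assumed binary prefix: 1 Kb = 2^10
MBIT = 1024 * 1024   # assumed binary prefix: 1 Mb = 2^20


def chip_capacity_bits(words: int, bits_per_word: int) -> int:
    """Total bits stored by an array of `words` locations, each `bits_per_word` wide."""
    return words * bits_per_word


one_by_one = chip_capacity_bits(1 * MBIT, 1)     # 1 Mb x 1 organization
quarter_by_four = chip_capacity_bits(256 * KBIT, 4)  # 256 Kb x 4 organization

# Both organizations hold the same number of bits; they differ in how many
# bits come out per access.
print(one_by_one, quarter_by_four)
```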


Internal Memory

+Semiconductor Memory Types

+Flash Memory

Provides block electrical erasure but not byte level

Typical block size: 512, 2048, 4096

High density

One transistor per bit

Fast read speeds, but not as fast as DRAM

Very slow erase speed

+Error Detection and Correction

Hard Failure

Permanent defect

Caused by: harsh environmental abuse, manufacturing defects, wear

Soft Error

Random, non-destructive

No permanent damage to memory

Caused by: power supply problems


A single parity bit can be used to detect (most) errors in a word

Parity bit test can fail to detect errors when there is more than one bit error

Hamming codes can be used to detect and correct errors


Bits are occasionally flipped in transmission. For example: 1101001 is sent, but 0101011 is received.

Adding redundancy can allow us to detect, and possibly correct, some errors of this type.

Simple approach: Repeat each bit

Repeat each bit twice. For bit x, transmit xx. If the receiver gets two different bits, it requests a retransmission. This is an error detecting code.

Allows for one error to be detected, but is not error correcting since retransmission is necessary.

Repeat each bit three times. For each bit x, transmit xxx. Now the receiver can correct a single error.

Why? With a single flip, two of the three copies still agree, so a majority vote recovers x.
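A minimal sketch of the triple-repetition scheme, assuming bit strings are represented as Python strings of "0" and "1". The function names are illustrative. The decoder takes a majority vote over each group of three received bits, which is why one flipped bit per group can be corrected.

```python
def encode_repeat3(bits: str) -> str:
    """Transmit each bit three times."""
    return "".join(b * 3 for b in bits)


def decode_repeat3(received: str) -> str:
    """Recover each bit by majority vote over its three copies."""
    decoded = []
    for i in range(0, len(received), 3):
        triple = received[i:i + 3]
        decoded.append("1" if triple.count("1") >= 2 else "0")
    return "".join(decoded)


sent = encode_repeat3("101")   # "111000111"
corrupted = "110000111"        # third bit of the first group flipped
print(decode_repeat3(corrupted))  # 101
```

If two bits in the same group flip, the vote picks the wrong value, so only a single error per group is correctable.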

+Problem with the simple approach

The receiver can detect and correct bit errors if each bit is transmitted three times. How does this affect performance?

Better approach: Parity check codes

Can detect any odd number of bit flips using a single parity bit.

+Calculating bit string parity

A bit string has odd parity if the number of 1s in the string is odd.

100011, 1, 000010 have odd parity

A bit string has even parity if the number of 1s in the string is even.

01100, 000, 11001001 have even parity

Assume 0 is an even number
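The definition translates directly into code: count the 1s and test whether the count is odd. This is a sketch with an illustrative function name `parity`, applied to the example strings from the slide.

```python
def parity(bits: str) -> str:
    """Return "odd" or "even" according to the number of 1s in the bit string."""
    return "odd" if bits.count("1") % 2 == 1 else "even"


for s in ("100011", "1", "000010", "01100", "000", "11001001"):
    print(s, parity(s))
```

The string 000 contains zero 1s and is reported as even, matching the convention that 0 is an even number.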

+Parity check code

Assume we are transmitting blocks of k bits.

A block (w) of length (k) is encoded as (wa), where the value of the parity bit (a) is chosen so that (wa) has even parity.

Example: If w = 10110, we send wa = 101101, which has even parity

With no bit flips in the transmission, the receiver gets the bit string exactly as it was sent by the sender. Bit string has even parity.

If there are an odd number of bit flips in the transmission, the receiver gets a bit string with odd parity. Retransmission is requested.

If there are an even number of bit flips in the transmission, the receiver gets a bit string with even parity. The error(s) go undetected.
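The three cases above can be reproduced with a short sketch. The helper names (`encode_even_parity`, `passes_parity_check`, `flip`) are illustrative, and the block w = 10110 is the example from the slide.

```python
def encode_even_parity(w: str) -> str:
    """Append parity bit a so that the encoded block wa has even parity."""
    a = str(w.count("1") % 2)
    return w + a


def passes_parity_check(received: str) -> bool:
    """The receiver accepts a block only if it has even parity."""
    return received.count("1") % 2 == 0


def flip(bits: str, *positions: int) -> str:
    """Flip the bits at the given 0-based positions (simulates channel errors)."""
    out = list(bits)
    for p in positions:
        out[p] = "1" if out[p] == "0" else "0"
    return "".join(out)


wa = encode_even_parity("10110")           # "101101"
print(passes_parity_check(wa))             # True: no flips
print(passes_parity_check(flip(wa, 0)))    # False: one flip is detected
print(passes_parity_check(flip(wa, 0, 1))) # True: two flips go undetected
```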

Another solution?

+2D parity check code

Blocks of bits are organized in rows and columns (m x n matrix)

The parity bit of each row is calculated, and appended to the row before it is transmitted

The parity of each column is calculated, and the parity bit of the entire matrix is computed. These are also transmitted to the receiver

m + n + 1 parity bits are computed

mn + m + n + 1 bits are sent to the receiver

Efficiency becomes greater as block size increases
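To make the efficiency claim concrete, the sketch below defines efficiency as data bits divided by total bits sent, mn / (mn + m + n + 1). That definition and the function name are assumptions for illustration; the slides do not give a formula for efficiency.

```python
def parity_2d_efficiency(m: int, n: int) -> float:
    """Fraction of transmitted bits that carry data for an m x n block."""
    data_bits = m * n
    total_bits = m * n + m + n + 1  # data + row parity + column parity + matrix parity
    return data_bits / total_bits


for size in (4, 8, 16, 32):
    print(f"{size} x {size}: {parity_2d_efficiency(size, size):.3f}")
```

For square blocks the ratio rises from 0.640 at 4 x 4 toward 1 as the block grows, because the parity overhead grows with m + n while the data grows with mn.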

+2D parity check

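Example: Original data: 1100, 1011, 0111, 0101

mn + m + n + 1 bits transferred. With m = n = 4: 4*4 + 4 + 4 + 1 = 25 bits

Each data row is followed by its row parity bit. The last row holds the column parity bits, and its last bit is the matrix parity bit.

1 1 0 0 | 0

1 0 1 1 | 1

0 1 1 1 | 1

0 1 0 1 | 0

0 1 0 1 | 0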
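A sketch of the 2D encoding for the example data, using even parity. The function names are illustrative, and the matrix parity bit is computed here as the parity of the row parity column, which is one reasonable reading of "parity bit of the entire matrix" and gives the same value as the parity of all data bits.

```python
def even_parity_bit(bits) -> int:
    """Parity bit that makes the total number of 1s even."""
    return sum(bits) % 2


def encode_2d(rows):
    """Append a parity bit to each row, then append a row of column parity bits."""
    with_row_parity = [list(r) + [even_parity_bit(r)] for r in rows]
    # The last column is the row parity column, so its parity is the matrix parity bit.
    column_parity = [even_parity_bit(col) for col in zip(*with_row_parity)]
    return with_row_parity + [column_parity]


data = [[1, 1, 0, 0], [1, 0, 1, 1], [0, 1, 1, 1], [0, 1, 0, 1]]
block = encode_2d(data)
for row in block:
    print(*row)
print(sum(len(row) for row in block), "bits")  # 25 bits
```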

+Hamming Code

Linear error detecting/correcting codes invented by Richard Hamming in 1950.

Can detect up to 2 bit errors

Can correct 1 bit errors

+Hamming Code – Parity bits

Hamming code works by distributing parity bits throughout a bit string of size (w)

(m) parity bits create a bit string of size 2^m – 1, of which 2^m – m – 1 bits can be used for data.

Common Hamming code sizes:

Hamming(3,1), 2 parity bits

Hamming(7,4), 3 parity bits

Hamming(15,11), 4 parity bits

Hamming(31,26), 5 parity bits

Is read as Hamming(total bits, data bits)
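The common sizes follow from the number of parity bits m: the codeword has 2^m - 1 bits, of which 2^m - m - 1 carry data. A quick sketch (the function name `hamming_size` is illustrative) regenerates the list:

```python
def hamming_size(m: int):
    """Return (total bits, data bits) for a Hamming code with m parity bits."""
    total = 2 ** m - 1
    data = total - m
    return total, data


for m in range(2, 6):
    total, data = hamming_size(m)
    print(f"Hamming({total},{data}), {m} parity bits")
```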

+Hamming Code

Example: Using Hamming(7,4), create the Hamming codeword for the following 4 bit string: 0101

Hamming(7,4): 7 total bits, 4 data bits, 3 parity bits

Parity bits are always located in the codeword at positions of 2^n.

P1 = 2^0 = 1

P2 = 2^1 = 2

P3 = 2^2 = 4

+Hamming Code

Place the data bits 0101 in the positions that are not powers of 2 (3, 5, 6, 7):

1 2 3 4 5 6 7

P1 P2 0 P3 1 0 1

Now, to calculate the parity bits (even parity).

To calculate P1: find the parity of substring(1,3,5,7): P1 0 1 1, so P1 = 0

To calculate P2: find the parity of substring(2,3,6,7): P2 0 0 1, so P2 = 1

To calculate P3: find the parity of substring(4,5,6,7): P3 1 0 1, so P3 = 0

Now we know P1, P2, and P3, and can calculate the codeword:

1 2 3 4 5 6 7

0 1 0 0 1 0 1
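The worked example can be sketched in Python as follows. `hamming74_encode` follows the slide's layout (parity bits at positions 1, 2, 4, even parity). `hamming74_correct` goes one step beyond the slides: it is a standard syndrome check, added here as an assumption about how the single-error correction mentioned earlier would be carried out. Both function names are illustrative.

```python
def hamming74_encode(data: str) -> str:
    """Build the 7-bit codeword for 4 data bits, parity bits at positions 1, 2, 4."""
    code = [0] * 8  # index 0 unused so indices match positions 1..7
    code[3], code[5], code[6], code[7] = (int(b) for b in data)
    code[1] = (code[3] + code[5] + code[7]) % 2  # P1 covers positions 1,3,5,7
    code[2] = (code[3] + code[6] + code[7]) % 2  # P2 covers positions 2,3,6,7
    code[4] = (code[5] + code[6] + code[7]) % 2  # P3 covers positions 4,5,6,7
    return "".join(str(b) for b in code[1:])


def hamming74_correct(received: str) -> str:
    """Recompute the three parity checks; a nonzero result names the flipped position."""
    code = [0] + [int(b) for b in received]
    c1 = (code[1] + code[3] + code[5] + code[7]) % 2
    c2 = (code[2] + code[3] + code[6] + code[7]) % 2
    c3 = (code[4] + code[5] + code[6] + code[7]) % 2
    error_position = c1 + 2 * c2 + 4 * c3
    if error_position:
        code[error_position] ^= 1  # flip the bad bit back
    return "".join(str(b) for b in code[1:])


codeword = hamming74_encode("0101")
print(codeword)                      # 0100101
print(hamming74_correct("0100001"))  # bit 5 flipped in transit -> 0100101
```

The failed checks add up to the position of the error (here checks 1 and 3 fail, giving 1 + 4 = 5), which is why the parity bits sit at powers of 2.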


Cloud Architectures

+Outline

Introduction

Software as a Service (SaaS)

Platform as a Service (PaaS)

Infrastructure as a Service (IaaS)

Background

Computational Resource Load Balancing

+Introduction

Scalable resource hosting: Storage, Computational, Software APIs, Applications

Tailored services: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)

Billed like a utility: Monthly, depending on usage

+Introduction

No formal definition!

A set of service oriented architectures, which allow users to access a number of resources in a way that is scalable, elastic, on-demand, and cost-efficient

[Figure labels: Client, Cloud Interface, Server, Compute Service, Storage Service, Other Services]

+Introduction


Infrastructure as a Service (IaaS) [2-4]

Lowest service level in cloud stack.

Provides compute, storage, and networking services using hardware virtualization.

Platform as a Service (PaaS) [2-4]

Software as a Service (SaaS) [2-4]

2. Edmonds, A., S. Johnston, T. Metsch, and G. Mazzaferro

3. Liu, F., J. Tong, J. Mao, R. Bohn, J. Messina, M. Badger, and D.

4. Canonical Group Ltd.

+Introduction

Typical General Purpose Private Cloud Architecture (Eucalyptus [5])

5. Eucalyptus Systems

+Types of Clouds

Public Cloud

Marketed based on: Resources offered, Availability, Security, Price

Local Cloud

Cloud architectures tailored to an organization’s needs

Hybrid Cloud

Combination of public and local cloud resources


Cloud Architecture Background

+Background

1960s: Concept of delivering computing resources through a global network

1970s: Computer Clusters

1990s: Grid Computing

2000s: Cloud: Evolution of Grid and Cluster

+Cloud Layers

Clients – Thick client, thin client, mobile client

Application Layer – SaaS

Platform Layer – PaaS

Infrastructure Layer – IaaS

Hardware Layer – Physical cloud resources


+Local Cloud Architecture - Eucalyptus

Open source cloud architectures have different names for components.

Share basic concepts

Five components:

Cloud Controller Node

Walrus (Image) Storage Node

User Persistent Storage Node

Cluster Controller Node

Compute Node

+Notes on Resource Virtualization

Cloud architectures generally provide physical resources to end users in the form of virtual machines

Virtual machines execute as process instances within an instance manager called a “Hypervisor”.

Allows multiple guest operating systems to run on a single host.

+Notes on Resource Virtualization

Full virtualization

Unmodified guest kernel

Not aware of hypervisor

Open or closed source OS

Slowest due to device emulation overhead

Paravirtualization

Modified guest kernel

Aware of hypervisor

No closed-source OS support

May have better performance due to modified kernel

Kernel based virtualization

Unmodified guest kernel

Not aware of hypervisor

Open or closed source OS

Best performance due to matching guest and host kernel
