UC Regents Spring 2014 © UCB CS 152: L2 Single-Cycle Wrap-up
2014-1-23
John Lazzaro (not a prof - “John” is always OK)
CS 152 Computer Architecture and Engineering
www-inst.eecs.berkeley.edu/~cs152/
TA: Eric Love
Lecture 2 – Single Cycle Wrap-up
Play:
Nvidia Tegra K1 Tech Talk, 5:30 PM this Thursday in the Woz.
Tegra K1 remixes the Kepler GPU architecture for low-power SOCs.
UC Regents Spring 2014 © UCB CS 152: Single-Cycle Design
Topics for today’s lecture
Single-Cycle CPU Design
Very Long Instruction Words (VLIW): Doing more work in a single cycle.
Short Break.
Walk up to John and Eric during the break to discuss
individual administrative issues.
Single Cycle CPU Design
Single Cycle CPU design
All instructions execute in a single cycle of the clock.
(positive edge to positive edge)
All state elements act like positive edge-triggered flip flops.
[Figure: D flip-flop symbol with input D, output Q, and clock clk.]
No delayed branches: The PC of the next instruction executed after a taken branch is the branch target of the taken branch.
No delayed loads: The next instruction executed after the load sees the value that was retrieved by the load in the appropriate register.
We will re-introduce delayed branch and delayed load semantics in the pipelining lecture.
Contract changes
Architected state
The state visible to the programmer. The state that appears in machine language instructions.
Main Memory: 2^32 bytes, organized as 32-bit words (word addresses 00000000, 00000004, 00000008, ..., FFFFFFF8, FFFFFFFC; the last byte is at FFFFFFFF).
Program Counter (PC): 32 bits, supplies the address (addr) of the next instr.
32 32-bit Registers: R0 [hardwired to constant 0], R1, ..., R30, R31.
All state elements in our single-cycle CPU design hold
architected state.
Recall: MIPS R-format instructions
Fields: opcode rs rt rd shamt funct
Instruction Fetch: Fetch next inst from memory: 012A4020
Instruction Decode: Decode fields to get: ADD $8 $9 $10
Operand Fetch: “Retrieve” register values: $9 $10
Execute: Add $9 to $10
Result Store: Place this sum in $8
Next Instruction: Prepare to fetch instruction that follows the ADD in the program.
Syntax: ADD $8 $9 $10 Semantics: $8 = $9 + $10
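The decode step can be sketched in a few lines of Python. This is only an illustration of the standard MIPS R-format bit layout applied to the instruction word 012A4020 from the slide; the function name `decode_r_format` is made up for this example.

```python
def decode_r_format(word):
    """Split a 32-bit MIPS R-format instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs":     (word >> 21) & 0x1F,  # bits 25..21, first source register
        "rt":     (word >> 16) & 0x1F,  # bits 20..16, second source register
        "rd":     (word >> 11) & 0x1F,  # bits 15..11, destination register
        "shamt":  (word >> 6) & 0x1F,   # bits 10..6, shift amount
        "funct":  word & 0x3F,          # bits 5..0, selects the ALU function
    }


fields = decode_r_format(0x012A4020)
# fields: opcode 0, rs 9, rt 10, rd 8, shamt 0, funct 0x20 (ADD)
print(fields)
```

With rs = 9, rt = 10, and rd = 8, the word decodes to ADD $8 $9 $10, matching the slide.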
Goal #1: An R-format single-cycle CPU
Fields: opcode rs rt rd shamt funct
Syntax: ADD $8 $9 $10 Semantics: $8 = $9 + $10
Sample program:
ADD $8 $9 $10
SUB $4 $8 $3
AND $9 $8 $4
...
How registers get their initial values is not of concern to us right now.
No loads or stores: machine has no use for data memory, only instruction memory.
No branches or jumps: machine only runs straight line code.
Separate Read-Only Instruction Memory
[Figure: InstrMem block with a 32-bit Addr input and a 32-bit Data output.]
Reads are combinational: Put a stable address on input, a short time later data appears on output.
Not concerned about how programs are loaded into this memory.
Related to separate instruction and data caches in “real” designs.
Task #1: Straight-line Instruction Fetch
[Figure: InstrMem block with a 32-bit Addr input and a 32-bit Data output.]
“Requirements”: Fetching straight-line MIPS instructions requires a machine that generates this timing diagram (one value per CLK cycle):
Addr: PC, PC + 4, PC + 8
Data: IMem[PC], IMem[PC + 4], IMem[PC + 8]
PC == Program Counter, points to next instruction.
Why +4 and not +1? 32-bit instructions.
Why increment every cycle? Straight-line code.
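The required timing diagram can be reproduced with a small Python model. This is a behavioral sketch, not a hardware description: the dictionary `imem` stands in for the instruction memory, and the three words are hand-encoded versions of the ADD, SUB, and AND sample program (the encodings are supplied for this example, not taken from the slides).

```python
def fetch_straight_line(imem, pc, cycles):
    """Model the PC register and the +4 adder for a number of clock cycles."""
    trace = []
    for _ in range(cycles):
        addr = pc                    # PC register output drives Addr
        data = imem[addr]            # combinational read of instruction memory
        trace.append((addr, data))
        pc = (pc + 4) & 0xFFFFFFFF   # next positive edge: PC takes PC + 4
    return trace


# Hypothetical memory image: ADD $8 $9 $10, SUB $4 $8 $3, AND $9 $8 $4
imem = {0x0: 0x012A4020, 0x4: 0x01032022, 0x8: 0x01044824}
trace = fetch_straight_line(imem, pc=0x0, cycles=3)
for addr, data in trace:
    print(f"Addr={addr:08X} Data={data:08X}")
```

The step is 4 because memory is byte-addressed and each instruction occupies four bytes.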
New Component: Register (for PC)
[Figure: 32-bit register PC with input Din, output Dout, and clock Clk.]
Built out of an array of flip-flops: one D flip-flop per bit (Din0 to Dout0, Din1 to Dout1, Din2 to Dout2, ...), all sharing clk.
In later examples, we will add an “enable” input: clock edge updates state only if enable is high.
How to design? Mux Q back to D.
New Component: A 32-bit adder (ALU)
[Figure: 32-bit adder with inputs A and B and output A + B.]
Adder: Combinational: Put A and B values on inputs, a short time later A + B appears on output.
[Figure: 32-bit ALU with inputs A, B, and op, and output A op B.]
ALU: Combinational part that is able to execute many functions of A and B (add, sub, and, or, ...). The “op” value selects the function; the op input is log2(#ops) bits wide.
Sometimes, extra outputs for use by control logic, such as “Equal?” ...
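A behavioral sketch of such an ALU in Python may help. The four operations and the string op names are assumptions chosen for this example (the slide only lists add, sub, and, or as samples), and results are masked to 32 bits to mimic the bus width.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical operation set; a real ALU would use a binary op encoding.
ALU_OPS = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "and": lambda a, b: a & b,
    "or":  lambda a, b: a | b,
}

# Number of select bits needed to choose among the operations.
OP_BITS = (len(ALU_OPS) - 1).bit_length()


def alu(op, a, b):
    """Return (A op B, Equal) for 32-bit inputs A and B."""
    a &= MASK32
    b &= MASK32
    return ALU_OPS[op](a, b), a == b


print(alu("add", 9, 10))   # (19, False)
print(alu("sub", 0, 1))    # (4294967295, False): result wraps to 32 bits
print(OP_BITS)             # 2
```

The second return value models the extra "Equal?" output that control logic can use.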
Design: Straight-line Instruction Fetch
[Figure: The PC register (input D, output Q, clocked by Clk) drives the InstrMem Addr input. A 32-bit adder computes Q + 0x4 and feeds the sum back to D.]
0x4: +4 in hexadecimal
State machine design in the service of an ISA
Timing diagram (one value per CLK cycle):
Addr: PC, PC + 4, PC + 8
Data: IMem[PC], IMem[PC + 4], IMem[PC + 8]
Goal #1: An R-format single-cycle CPU
Fetching the instruction and preparing the next fetch: Done!
To continue, we need registers ...
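MIPS Register file: From the top down
[Figure: Register file built from clocked registers R1, R2, ..., R31 sharing clk, plus R0.]
Why is R0 special? R0 - The constant 0.
“Two read ports”: two 32-bit wide muxes, each with 32 inputs. sel(rs1) (5 bits) picks the register that drives rd1; sel(rs2) (5 bits) picks the register that drives rd2.
Write port: the 32-bit wd bus goes to the D input of every register. A DEMUX steered by sel(ws) (5 bits) routes WE to the En input of one register.
How do we add a second write port? Duplicate write buses, add muxes.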
Register File Schematic Symbol
[Figure: RegFile symbol. Inputs: rs1 (5 bits), rs2 (5 bits), ws (5 bits), wd (32 bits), WE. Outputs: rd1 (32 bits), rd2 (32 bits).]
Why do we need WE? Advanced planning, for instructions that don’t write the register file.
If we had a MIPS register file w/o WE, how could we work around it? Do writes to the hardwired-to-zero register R0.
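The register file interface can be modeled in Python as a sketch. The class and method names are hypothetical; the port names (rs1, rs2, rd1, rd2, ws, wd, WE) follow the schematic symbol. Reads are treated as combinational and the write as happening on the clock edge.

```python
class RegFile:
    """32 x 32-bit register file: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs1, rs2):
        """Combinational read ports: returns (rd1, rd2)."""
        return self.regs[rs1], self.regs[rs2]

    def posedge(self, we, ws, wd):
        """On the positive clock edge, write wd into register ws if WE is high."""
        if we and ws != 0:                 # R0 stays hardwired to constant 0
            self.regs[ws] = wd & 0xFFFFFFFF


rf = RegFile()
rf.posedge(we=True, ws=9, wd=9)
rf.posedge(we=True, ws=10, wd=10)
rf.posedge(we=False, ws=8, wd=123)   # WE low: no state change
rf.posedge(we=True, ws=0, wd=55)     # write to R0 is discarded
print(rf.read(9, 10))                # (9, 10)
print(rf.read(8, 0))                 # (0, 0)
```

The last two calls illustrate both answers on the slide: WE low suppresses a write, and a write aimed at R0 has the same effect.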
Goal #1: An R-format single-cycle CPU
What do we do with these?
Computing engine of the R-format CPU
[Figure: RegFile outputs rd1 and rd2 drive the two 32-bit ALU inputs; the 32-bit ALU output drives RegFile wd. Logic decodes the instruction fields into rs1, rs2, ws, and the ALU op.]
Fields: opcode rs rt rd shamt funct
Decode fields to get: ADD $8 $9 $10
What do we do with WE? Hardwire to always write.
Putting it all together ...
[Figure: The instruction fetch loop (PC, + 0x4 adder, InstrMem) joined to the computing engine (RegFile, ALU, Logic). InstrMem Data goes to the rs1, rs2, ws, op decode logic.]
Is it safe to use same clock for PC and RegFile? Yes!
Recall: Our ideal-world D Flip-Flop
[Figure: D flip-flop with input D, output Q, and clock CLK, with waveforms for D and Q.]
Value of D is sampled on positive clock edge. Q outputs sampled value for rest of cycle.
Also assume: clocks arrive at all flip flops simultaneously.
Reminder: How data flows after posedge
[Figure: The full R-format data path (PC, + 0x4 adder, InstrMem, Logic, RegFile, ALU), showing values flowing from the state element outputs through the combinational logic after the positive edge.]
Next posedge: Update state and repeat
[Figure: The state elements, RegFile (write port wd, ws, WE) and PC (D, Q), which update on the next positive edge.]
In this ideal world, as long as the clock is slow enough, the machine gets the right answer.
In Metrics lecture, we look at the assumptions behind ideality.
Next Step ...
Design stand-alone machines for other major classes of instructions: immediates, branches, load/store.
Learn how to efficiently “merge” single-function machines to make one general-purpose machine.
Goal #2: add I-format ALU instructions
Syntax: ORI $8 $9 64 Semantics: $8 = $9 | 64
In this example, $9 is rs and $8 is rt.
16-bit immediate extended to 32 bits.
Zero-extend: 0x8000 ⇨ 0x00008000
Sign-extend: 0x8000 ⇨ 0xFFFF8000
Some MIPS instructions zero-extend immediate field, other instructions sign-extend.
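The two extension rules are easy to check in Python. The helper names below are made up for this sketch; the 0x8000 test value comes from the slide.

```python
def zero_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by filling the upper half with 0s."""
    return imm & 0xFFFF


def sign_extend16(imm):
    """Extend a 16-bit immediate to 32 bits by copying bit 15 into the upper half."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm


print(f"{zero_extend16(0x8000):08X}")   # 00008000
print(f"{sign_extend16(0x8000):08X}")   # FFFF8000
print(9 | zero_extend16(64))            # 73: ORI $8 $9 64 when $9 holds 9
```

The last line assumes $9 holds the value 9, purely for illustration.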
Computing engine of the I-format CPU
[Figure: RegFile output rd1 and the extended immediate (Ext) drive the two 32-bit ALU inputs; the ALU output drives RegFile wd. Logic decodes the instruction fields.]
Decode fields to get: ORI $8 $9 64
In a Verilog implementation, what should we do with rs2? Tie to the value that minimizes energy consumption.
Merging data paths ...
[Figure: The I-format and R-format data paths side by side.]
Add muxes. Where? How many? (ignore ALU control) 2
The merged data path ...
[Figure: Merged R-format and I-format data path (RegFile, Ext, ALU) with control signals RegDest, ALUsrc, ExtOp, and ALUctr.]
Fields: opcode rs rt rd shamt funct
If you watched it being designed, it’s understandable ...
Memory Instructions
Loads, Stores, and Data Memory ...
[Figure: Data Memory block. Inputs: Addr (32 bits), Din (32 bits), WE. Output: Dout (32 bits).]
Syntax: LW $1, 32($2) Action: $1 = M[$2 + 32]
Syntax: SW $3, 12($4) Action: M[$4 + 12] = $3
Zero-extend or sign-extend immediate field? Sign-extend.
Writes are clocked: If WE is high, memory Addr captures Din on positive edge of clock.
Reads are combinational: Put a stable address on Addr, a short time later Dout is ready.
Note: Not a realistic main memory (DRAM) model ...
Adding data memory to the data path
[Figure: Merged data path with Data Memory added. Control signals: RegDest, RegWr, ExtOp, ALUsrc, ALUctr, MemWr, MemToReg.]
Syntax: LW $1, 32($2) Action: $1 = M[$2 + 32]
Syntax: SW $3, 12($4) Action: M[$4 + 12] = $3
Recall spec: no load delay slot.
Branch Instructions
Conditional Branches in MIPS ...
Syntax: BEQ $1, $2, 12
Action: If ($1 != $2), PC = PC + 4
Action: If ($1 == $2), PC = PC + 4 + 48
Immediate field codes # words, not # bytes. Why is this encoding a good idea? Increases branch range to 128 KB.
Zero-extend or sign-extend immediate field? Sign-extend.
Why is this extension method a good idea? Supports forward and backward branches.
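The BEQ next-PC rule can be written out as a short Python function. This is a sketch of the semantics stated on the slide; the function name and the example PC value 0x100 are assumptions for illustration.

```python
def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Next PC for BEQ: the immediate counts words and is sign-extended."""
    imm16 &= 0xFFFF
    offset_words = imm16 - 0x10000 if imm16 & 0x8000 else imm16  # sign-extend
    if rs_val == rt_val:
        return (pc + 4 + 4 * offset_words) & 0xFFFFFFFF          # taken
    return (pc + 4) & 0xFFFFFFFF                                 # not taken


print(hex(next_pc_beq(0x100, 5, 5, 12)))       # 0x134 = 0x100 + 4 + 48
print(hex(next_pc_beq(0x100, 5, 6, 12)))       # 0x104
print(hex(next_pc_beq(0x100, 5, 5, 0xFFFF)))   # 0x100: offset of -1 word
```

The third call shows a backward branch: an immediate of 0xFFFF sign-extends to -1 word, so the branch lands on the BEQ itself.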
Adding branch testing to the data path
[Figure: Data path with the ALU's Equal output wired into control. Control signals: RegDest, RegWr, ExtOp, ALUsrc, ALUctr, MemWr, MemToReg.]
Syntax: BEQ $1, $2, 12
Action: If ($1 != $2), PC = PC + 4
Action: If ($1 == $2), PC = PC + 4 + 48
Recall: Straight-line Instruction Fetch
Fetching straight-line MIPS instructions requires a machine that generates this timing diagram (one value per CLK cycle):
Addr: PC, PC + 4, PC + 8
Data: IMem[PC], IMem[PC + 4], IMem[PC + 8]
PC == Program Counter, points to next instruction.
[Figure: The PC register (D, Q, Clk) drives the InstrMem Addr input; a 32-bit adder feeds Q + 0x4 back to D.]
Syntax: BEQ $1, $2, 12
Action: If ($1 != $2), PC = PC + 4
Action: If ($1 == $2), PC = PC + 4 + 48
How do we add this behavior?
Design: Instruction Fetch with Branch
[Figure: Instruction fetch with branch support. The PC register (D, Q, Clk) drives the InstrMem Addr input. One adder adds 0x4; a second adder takes the output of an Extend block. The PCSrc control signal selects the next PC.]
Syntax: BEQ $1, $2, 12
Action: If ($1 != $2), PC = PC + 4
Action: If ($1 == $2), PC = PC + 4 + 48
Single-Cycle Control
What is single cycle control?
[Figure: Control block next to the data path. Inputs: the instruction from InstrMem Data and the Equal signal. Outputs: RegDest, RegWr, ExtOp, ALUsrc, ALUctr, MemWr, MemToReg, PCSrc. The rs, rt, rd, imm fields go to the data path.]
Combinational Logic (Only Gates, No Flip Flops)
Just specify logic functions!
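Because control is purely combinational, it can be sketched as a lookup table in Python. The signal names come from the slides, but the polarities are assumptions made for this example: RegDest = 1 selects rd as the write register, ExtOp = 1 means sign-extend, ALUsrc = 1 selects the extended immediate, MemToReg = 1 selects data memory output, and PCSrc = 1 selects the branch target. Don't-care entries are written as 0, ALUctr is left out, and only five instruction classes are covered.

```python
# opcode -> (RegDest, RegWr, ExtOp, ALUsrc, MemWr, MemToReg, is_branch)
# Assumed signal polarities; don't-care values are shown as 0.
CONTROL_TABLE = {
    0x00: (1, 1, 0, 0, 0, 0, 0),  # R-format (ADD, SUB, AND, ...)
    0x0D: (0, 1, 0, 1, 0, 0, 0),  # ORI: zero-extended immediate
    0x23: (0, 1, 1, 1, 0, 1, 0),  # LW: sign-extended offset, write from memory
    0x2B: (0, 0, 1, 1, 1, 0, 0),  # SW: sign-extended offset, memory write
    0x04: (0, 0, 0, 0, 0, 0, 1),  # BEQ: no register or memory write
}
SIGNALS = ("RegDest", "RegWr", "ExtOp", "ALUsrc", "MemWr", "MemToReg")


def control(opcode, equal):
    """Pure function of the opcode and the Equal input: no stored state."""
    *values, is_branch = CONTROL_TABLE[opcode]
    signals = dict(zip(SIGNALS, values))
    signals["PCSrc"] = int(bool(is_branch and equal))
    return signals


print(control(0x23, equal=False))   # LW
print(control(0x04, equal=True))    # BEQ with equal operands: PCSrc = 1
```

A single wrong bit in a table like this changes what an instruction does, which is why the bug-free goal matters so much for control logic.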
Two goals when specifying control logic
Bug-free: One “0” that should be a “1” in the control logic function breaks contract with the programmer.
Efficient: Logic function specification should map to hardware with good performance properties: fast, small, low power, etc.
Should be easy for humans to read and understand: sensible
signal names, symbolic constants ...
time machine back to FPGA-oriented 2006 CS 152 ...
In practice: Use behavioral Verilog
Advice: Carefully written Verilog will yield identical semantics in ModelSim and Synplicity. If you write your code in this way, many “works in ModelSim but not on Xilinx” issues disappear.
Always check log files, and inspect output tools produce!
Look for tell-tale Synplicity “warnings and errors” messages! “latch generated”, “combinational loop detected”, etc.
Automate with scripts if possible.
F06 152 Labs: A small subset of MIPS ...
What if some other instruction appears in the instruction stream?
For labs: undefined.
Real world: exceptions.
Why not in labs? Doubles complexity!
Components in blue handle exceptions ... Will cover this (pipelined CPU) example later in the term ...
A slide from Eric’s section on 1/22 ...
Actually, I agree ... However ... if you aren’t able to design at the level we just worked through, you will unwittingly propose unbuildable ideas ... and lose the confidence of your fellow team members.
Upcoming 2014 Lab 1 ...
RISC-V Single-Cycle CPU, Written in Chisel
If you understood this lecture, you now have the conceptual foundation to modify this design ... if you’re willing to spend a few days to teach yourself Chisel.
Break
Play:
VLIW: Very Long Instruction Words
Josh Fisher: idea grew out of his Ph.D (1979) in compilers.
Led to a startup (MultiFlow) whose computers worked, but which went out of business ... the ideas remain influential.
Basic Idea: Super-sized Instructions
Example: All instructions are 64-bit. Each instruction consists of two 32-bit MIPS instructions, that execute in parallel.
A 64-bit VLIW instruction:
opcode rs rt rd shamt funct | opcode rs rt rd shamt funct
Syntax: ADD $8 $9 $10 Semantics: $8 = $9 + $10
Syntax: ADD $7 $8 $9 Semantics: $7 = $8 + $9
VLIW Assembly Syntax ...
Instr: ADD $8 $9 $10 ADD $7 $8 $9
Instr: SUB $2 $3 $0 OR $1 $5 $4
Label: AND $5 $2 $3 OR $1 $5 $4
[...]
“Instr:” denotes start of an instruction word. Listed operators all execute in parallel.
“Label:” is a branch label name, used instead of default “instr”.
Assume: $7 = 7, $8 = 8, $9 = 9, $10 = 10 (decimal)
32-bit MIPS:
ADD $8 $9 $10 ; Result: $8 = 19
ADD $7 $8 $9 ; Result: $7 = 28
VLIW:
Instr: ADD $8 $9 $10 ; result $8 = 19
       ADD $7 $8 $9 ; result $7 = 17 (not 28)
32-bit & 64-bit semantics different? Yes!
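The difference between the two semantics can be reproduced with a small Python sketch. Register state is modeled as a dictionary and each operation as an (rd, rs, rt) tuple meaning rd = rs + rt; both conventions are choices made for this example.

```python
def run_sequential(regs, ops):
    """32-bit MIPS: each ADD sees the results of the ADDs before it."""
    regs = dict(regs)
    for rd, rs, rt in ops:
        regs[rd] = regs[rs] + regs[rt]
    return regs


def run_vliw_word(regs, ops):
    """VLIW: all operators in one word read the register values from before the word."""
    old = dict(regs)
    new = dict(regs)
    for rd, rs, rt in ops:
        new[rd] = old[rs] + old[rt]
    return new


regs = {7: 7, 8: 8, 9: 9, 10: 10}
ops = [(8, 9, 10), (7, 8, 9)]     # ADD $8 $9 $10 ; ADD $7 $8 $9

seq = run_sequential(regs, ops)
par = run_vliw_word(regs, ops)
print(seq[8], seq[7])   # 19 28
print(par[8], par[7])   # 19 17
```

In the VLIW case the second ADD reads the old value of $8 (8), because both operators fetch their operands at the same time.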
Design: A 64-bit VLIW R-format CPU
No loads or stores: machine has no use for data memory, only instruction memory.
No branches or jumps: machine only runs straight line code.
opcode rs rt rd shamt funct | opcode rs rt rd shamt funct
Syntax: ADD $8 $9 $10 Semantics: $8 = $9 + $10
Syntax: ADD $7 $8 $9 Semantics: $7 = $8 + $9
VLIW: Straight-line Instruction Fetch
Simple changes to support 64-bit instructions ...
[Figure: The same PC register and adder loop as the 32-bit design, but the adder constant is 0x8 and the InstrMem Data output is 64 bits wide.]
0x8: +8 in hexadecimal -- 64 bit instructions
Timing diagram (one value per CLK cycle):
Addr: PC, PC + 8, PC + 16
Data: IMem[PC], IMem[PC + 8], IMem[PC + 16]
Computing engine of VLIW R-format CPU
[Figure: Two 32-bit ALUs share one RegFile with four read ports (rs1 to rs4 select rd1 to rd4) and two write ports (ws1, wd1, WE1 and ws2, wd2, WE2). Each half of the 64-bit instruction (opcode rs rt rd shamt funct) controls one ALU.]
What have we gained with 64-bit VLIW?
opcode rs rt rd shamt funct | opcode rs rt rd shamt funct
Syntax: ADD $8 $9 $10 Semantics: $8 = $9 + $10
Syntax: ADD $7 $8 $9 Semantics: $7 = $8 + $9
If: Clock speed remains the same. All 32-bit operators do useful work.
Performance doubles!
N x 32-bit VLIW yields factor of N speedup! Multiflow: N = 7, 14, or 28 (3 CPUs in product family)
What does N = 14 assembly look like?
Two instructions from a scientific benchmark (Linpack) for a MultiFlow CPU with 14 operations per instruction.
What have we gained with 64-bit VLIW?
Performance doubles (and N x 32-bit VLIW yields a factor of N speedup) only if the clock speed remains the same and all 32-bit operators do useful work.
A very big “if”!
As N scales, HW and SW needs conflict
[Figure: Layers of a computer system. Software: Application (iTunes), Operating System (Mac OS X), Compiler, Assembler. Instruction Set Architecture. Hardware: Processor, Memory, I/O system, Datapath & Control, Digital Design, Circuit Design, Transistors.]
Instruction Set Architecture: Where the conflict plays out.
Hardware need: Clock does not slow down.
Software need: All operators do useful work.
Example problem: Register file ports ...
[Figure: Two 32-bit ALUs sharing one RegFile with four read ports (rs1 to rs4, rd1 to rd4) and two write ports (ws1, wd1, WE1 and ws2, wd2, WE2).]
N ALUs require 2*N read ports and N write ports. Why is this a problem?
Recall: Register File Design
[Figure: The register file from earlier: registers R1 ... R31 plus the constant R0, two read muxes selected by sel(rs1) and sel(rs2) driving rd1 and rd2, and a write DEMUX selected by sel(ws) that routes WE to one register's En input; wd (32 bits) feeds every D input.]
More read ports increase fanout and slow down reads.
More write ports add data muxes and a demux OR tree.
Split register files: A solution?
[Figure: Two ALUs, each with its own RegFile (two read ports rs1/rd1 and rs2/rd2, one write port ws, wd, WE).]
Software need: All operators do useful work.
Too often, the data an ALU needs to do “useful work” will not be in its own regfile.
Architect’s job: Find a good compromise
[Figure: Layers of a computer system. Software: Application, Operating System, Compiler, Assembler. Instruction Set Architecture. Hardware: Processor, Memory, I/O system, Datapath & Control, Digital Design, Circuit Design, Transistors.]
Instruction Set Architecture: Where the conflict plays out.
Example solution: Split register files, with a dedicated bus and special instructions for moves between regfiles. May hurt software more than it helps hardware :-(
Branch policy: All instr operators execute
opcode rs rt rd shamt funct | opcode rs rt rd shamt funct
BNE $8 $9 Label ADD $7 $8 $9
ADD executes if branch is taken or not taken.
Problem: Large N machines find it hard to fill all operators with useful work.
Solution: New “predication” operator.
Syntax: SELECT $7 $8 $9 $10
Semantics: If $8 == 0, $7 = $10, else $7 = $9
Permits simple branches to be converted to inline code.
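The SELECT semantics can be compared against the branching code it replaces with a Python sketch. The function names and the dictionary register model are assumptions for this example; the operand roles follow the slide's Syntax and Semantics lines.

```python
def with_branch(regs):
    """Branching version: control flow decides which value reaches $7."""
    regs = dict(regs)
    if regs[8] == 0:
        regs[7] = regs[10]
    else:
        regs[7] = regs[9]
    return regs


def select(regs, rd, rcond, r_nonzero, r_zero):
    """SELECT rd rcond r_nonzero r_zero: one operator, no change in control flow."""
    regs = dict(regs)
    regs[rd] = regs[r_zero] if regs[rcond] == 0 else regs[r_nonzero]
    return regs


for cond in (0, 8):
    regs = {7: 7, 8: cond, 9: 9, 10: 10}
    a = with_branch(regs)
    b = select(regs, 7, 8, 9, 10)    # SELECT $7 $8 $9 $10
    print(cond, a[7], b[7])          # both versions agree for each cond
```

Because SELECT is an ordinary operator, it can share an instruction word with other operators, which is how it helps fill slots that a branch would otherwise leave empty.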
Branch nesting in a single instruction ...
opcode rs rt rd shamt funct | opcode rs rt rd shamt funct
BEQ $8 $9 LabelOne BEQ $11 $12 LabelTwo
Conundrum: How to define the semantics of multiple branches in one instruction?
Solution: Nested branch semantics
If $8 == $9, branch to LabelOne
Else if $11 == $12, branch to LabelTwo
MultiFlow: N-way Branch priority set in an opcode field.
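The nested branch rule amounts to a priority scan over the branches in one instruction word, which can be sketched in Python. The function name, the (rs, rt, target) tuple format, the example addresses, and the 8-byte fall-through step (taken from the 64-bit instruction example) are assumptions for this sketch.

```python
def vliw_next_pc(regs, pc, branches, word_bytes=8):
    """Return the target of the first BEQ (in priority order) whose operands are equal."""
    for rs, rt, target in branches:
        if regs[rs] == regs[rt]:
            return target
    return pc + word_bytes           # no branch taken: fall through to next word


LABEL_ONE, LABEL_TWO = 0x400, 0x800  # hypothetical label addresses
branches = [(8, 9, LABEL_ONE), (11, 12, LABEL_TWO)]

both = {8: 1, 9: 1, 11: 2, 12: 2}
second = {8: 1, 9: 0, 11: 2, 12: 2}
neither = {8: 1, 9: 0, 11: 2, 12: 3}
print(hex(vliw_next_pc(both, 0x100, branches)))     # 0x400: first branch wins
print(hex(vliw_next_pc(second, 0x100, branches)))   # 0x800
print(hex(vliw_next_pc(neither, 0x100, branches)))  # 0x108
```

When both conditions hold, the first branch in the list wins, mirroring the if/else nesting on the slide.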
Will return to VLIW later in semester ...
Next Tuesday: How to measure the “goodness” of an architecture (and an implementation) ...
... and if we have time, we’ll discuss microcode (on class website, click on link for reading PDF).
Have a good weekend!