Status and some history
Ron Minnich
LA-UR-06-6779
Origins
● LinuxBIOS began life as a way to fix problems with supercomputer clusters
● Goal was to have a BIOS that only knew how to load Linux; nothing more
● Started in 1999
● First real system working in Spring 2000
Some pictures from then
And ... the first boot
So we had one node in 2000? No, by November we had 14!
A Linux Labs system at SC 2000
And in 2001, we had 100 ...
That 2001 cluster was 96 Alpha systems running LinuxBIOS
● Port courtesy of Eric Biederman at Linux NetworX
● So in 2001, we had two important trends
– Involvement of several Linux vendors
– Crossed the 100-node mark
– We went multi-architecture (note: we were not the first)
– We went 64-bit (and 64-bit clean)
● And then, in 2002:
Thousands of nodes in 2002
● In 2002, several thousand cluster nodes were built using LinuxBIOS
● By 2003, the number topped 10,000
● By 2004, growth was in this “slow but steady” mode
● Then something interesting happened
Supercomputers? Who cares? We're building TV boxes!
● At some point, in '05, a company built over 1 MILLION nodes using LinuxBIOS
● At a stroke, this new use of LinuxBIOS dwarfed all other uses
● And pointed to an interesting trend ...
Supercomputing: units of $1M
Embedded: units of $.01
● There was a strong argument for LinuxBIOS in supercomputing
● Lower capital and maintenance costs
● Easier to maintain
● Did a better job than other BIOSes
● But in terms of per-board cost, it had little effect
● It was really only benefiting the customer
● But the vendor paid the BIOS price anyway, then threw the BIOS away ...
Embedded: pennies are a lot!
● But in embedded, it was even simpler
● LinuxBIOS saves a LOT of money per board (when the board is really cheap)
● Can be customized in all sorts of ways
● Bug fix turnaround time is very short
● Overall, can save several million dollars over a production run
● “How can we not use it?”
Recent call from an embedded company
● Nobody wants Windows(TM), our customers all run Linux
● We're paying a lot of $$$ per unit to support an OS our customers neither want nor need
● How soon can we get LinuxBIOS going?
● And then the really cool part
– It's going
– Somebody else (not us) did the port
– And gave it back to the tree
Why do companies give back?
● NOT the real reason:
– It's required to give it (on request) to customers
● The real reason
– Support team for LinuxBIOS is large, skilled, open
– Free is not the issue
– The issue is that there are so many good people, support needs get addressed very quickly
– More quickly than these companies are used to
Economics at scale
● So giving back has a $$$ benefit
● Open source is a negative cost
● This is what we should work toward
● We can take the next step to working on a large scale
And then came OLPC
● OLPC had certain needs that could not be fulfilled by commercial BIOSes
● Instant-on
● Sleep/resume 100s of times a second
– ACPI was not going to do it
● Need highly capable USB stack – as good as Linux
● Need IPv6 in the BIOS
● All this stuff is easy – if Linux is your BIOS
So what's that growth look like?
[Chart: LinuxBIOS units shipped, 2000-2007. x-axis: year, 2000 to 2007; y-axis: # nodes, 0 to 10,000,000]
To put it another way
● All the growth we accomplished in the early years is as nothing compared to OLPC
● But OLPC would not have been possible absent that earlier work
● OLPC BIOS took about 2-3 months (from scratch) to get to booting Linux
● Only took that long because the CPU and chipset are very complex – about 10x more register settings than comparable systems
Other interesting developments
● People are starting to put interesting parts into the Opteron sockets
● Example: replace Opteron with FPGA
● Conventional BIOSes can not handle this type of system
● LinuxBIOS can – it is shipping now in these systems
● Here is a clear case where there was no alternative to LinuxBIOS
Other very important developments in 2006
● 2006 is turning out to be one of the important years
● We've perfected the technology (demo tomorrow) to build a BIOS automatically
– With embedded linux, initrd
– Embedded busybox
– Get a nice ash boot prompt
– Or pretty pictures – demo later
● Result: LinuxBIOS has a very clear “value proposition”
LinuxBIOS “value proposition”
● For products where:
– Windows(tm) is not needed, or
– Unit cost is important, or
– You want the Open Source negative cost benefit, or
– Existing mechanisms (e.g. ACPI) are unable to help, or
– Companies want to use Linux drivers in BIOS, or
– The power of Linux is needed at the BIOS level, or
– Quick turnaround on support issues is essential
● LinuxBIOS is the most cost-effective solution
Things are going to change
● I am seeing a real change in companies
● Before, it was “we'll do LinuxBIOS because you need it”
● Now, it is “We're doing LinuxBIOS because we need it to survive”
● The argument at some places is not whether, or how, but when
● I think we are on the verge of critical mass
We're showing
● That Linux-as-BIOS works
● That it can be very easy to use and set up
– Lots of “naive users” are installing LinuxBIOS today on OLPC, thanks to tools you'll see later
● That it makes business sense
● That, in short, it is the most sensible way to build systems that run Linux
– We'll save the other OSes for later, but help is on the way
– Note that we have a large cluster booting Plan 9 under LinuxBIOS
OLPC
You've seen it
I could not get Jim's slides so am going to tell you all I recall :-)
● Ask questions!
● Keyboard
● Display
● Size
● Processor
● Motherboard
● Now for the fun
The BIOS needs:
● Really fast boot
● Fast suspend/resume (10 ms)
– No way ACPI can do this
● Hackable – yep, another goal is that kids be able to remake the BIOS if they wish
● Secure BIOS update – yep, people are figuring out new ways to ensure that the wrong BIOS does not get loaded up
● This will be much more secure than today's commercial “security through obscurity”
Why this CPU is hard
● The Geode series is 2 levels of software
● The part you normally see
● The hidden part that implements PC hardware that does not exist
● This is implemented via system management interrupts
● So it is impossible to operate this CPU without an SMI handler
The “Virtual PC”
There's no hardware here!
[Diagram: LinuxBIOS executes outb(0xcfc, 0x80000010); the access causes an SMI TRAP; magic software (lots of it) decodes the operation and emulates it; the output value is stored in a data structure]
A complex CPU
● There is a very complex interconnection network with 8-port routers to access resources
● How does routing occur?
● In the MSR space
● The top 18 bits of an MSR address are six 3-bit routing fields
● So you potentially have 2^18 locations for 2^14 MSRs
● And, boy, there are a LOT of MSRs!
● Geode has 10x as many registers to set as any other platform
And, as mentioned, no config space
● There is no config space as we know it
● The config space accesses are managed by an SMI trap handler
● So it is literally impossible to bring up this machine without the Geode SMI handler
● The software which handles all this work is a small message-passing micro-kernel called VSA
● Yes, a micro-kernel!
VSA
● VSA is the code that runs as an SMI handler
● This was formerly closed-source
● But, last spring, AMD released the source
● And also announced that they would help those who needed help with BIOS development
● So the source is available
● But only compiles under no-longer-available versions of masm and mfc
VSA
● Is configurable too
● For OLPC, we went with the minimal configuration
● Just enough to support graphics, and PCI config
● How do we set up VSA?
● Pretty much the same way we set up VGA BIOS in older days
Setting up VSA
● VSA is held (nrv2b-compressed) in the FLASH part at the second 64k segment
● LinuxBIOS copies it to memory
● Then sets two critical registers (MSRs, what else?)
● Then pops to real mode and calls VSA, which installs itself as an SMI handler
● It's a good thing we wrote that realmode code ca. 2001 :-)
Other issues
● The operating flash is a serial FLASH part called SPI
● Which is run by a keyboard controller
– No power-up without this code
● Which needs 64K of FLASH space
● The keyboard controller code is about 2x the size of LinuxBIOS
● So the overall layout is 64K keyboard, 64K VSA, the rest to LinuxBIOS
What goes in the LinuxBIOS part?
● Well, we wanted USB, wireless, NAND flash support
● Did we really want to implement this ourselves?
● Answer: no
● Early decision: Linux
● So we decided to run with LinuxBIOS, and a very tightly configured Linux kernel/initrd
What's interesting about this decision
● It takes us back to where we started in 1999
● The original goal, in 1999, was to make Linux the BIOS
● In part, so we could exploit good drivers instead of the broken ones found in most BIOSes
● So the OLPC decision was actually taking us back to our roots
Making it fit
● The trick was to make it fit
● We accomplished this several ways
● Nrv2b compression of the VSA
● Different strategies for compressing the kernel/initrd
Compressing the kernel/initrd
● First try: use mkelfimage to concatenate bzimage + gzip -9 initrd
● Second try: mkelfimage with vmlinux and initrd, and nrv2b compress that
● Third try: lzma instead of nrv2b
● In the end, with a kernel with
– NAND flash, ext2, jffs2, all USB
● We had almost 256K left!
● But still need IPv6, and wireless ...
● Can we fit it?
Final ROM layout
● EC (keyboard controller): 64K
● VSA: 64K (soon to be 32K with LZMA)
● Linux Kernel, initrd, lzma compressed:
– 1M-128K-32K
● LinuxBIOS: 32K
So what's this all look like?
● Demo time!
How do you build it?
● At first, it was awful: had to build:
– Busybox
– uClibc
– Kernel
– LinuxBIOS
● Separately, then cat together with VSA and EC
● Jordan Crouse of AMD set up a tool called buildrom, which we have extended
● Now how do you build it? DEMO TIME!
Conclusions
● OLPC is breaking new ground
– LinuxBIOS on a huge scale
– Embedding Linux in a BIOS flash on commercial product
– Busybox command line environment for problems
– Good-looking, professional-looking boot screen
– A build environment that anybody can use
– Dealing with production-level issues
buildrom
● git clone git://dev.laptop.org/users/jcrouse/buildrom
LinuxBIOS V3
Ron Minnich
Note: Anything I say can and will be used against LinuxBIOS
● I deal with this from time to time – people who don't like open source and just make things up
● So, a little note: go to tianocore.org,
– Download it - all 12 MB of it
– Then try to figure it all out
– Then try to do anything with it
– Then realize that it supports 0 motherboards, and LB supports dozens
● At its hardest, LinuxBIOS is nowhere near as complex as Tiano, and it is far more capable
e.g.: tiano/EDK1.00/Foundation/Core/Dxe/Dispatcher/Dispatcher.c
VOID
EFIAPI
CoreFwVolEventProtocolNotify (
  IN EFI_EVENT Event,
  IN VOID *Context
  );
EFI_DEVICE_PATH_PROTOCOL *
CoreFvToDevicePath (
  IN EFI_FIRMWARE_VOLUME_PROTOCOL *Fv,
  IN EFI_HANDLE FvHandle,
  IN EFI_GUID *DriverName
  );
So, yes, we will be critical of LinuxBIOS here
● But let's not lose our perspective!
● There will be 10 million LinuxBIOS nodes shipped in 2007
● LinuxBIOS has seen exponential growth since 2000
– 10^(year-2000)
● There are lots of companies and users
● It's hard, and will be improved, but it also works
● And it's not hard for everyone – some people just “get it” after a few days
Overview
● What V2 does well
● What it should do better
● What we want to change
● Proposed structure
● Discussion
● Conclusions
What V2 does well
● Object-oriented structure
– Everything is an object with same type of control points
● Has incredible flexibility, shown time and time again
– Most recently on OLPC, where we had all kinds of special cases
– Simple example: need to change top-level enumeration due to weird GX2 structure
– Can't do enumeration in normal way:
OLPC Example: pci_domain_set_resources
● See src/northbridge/amd/gx2/northbridge.c
● All northbridges to date: use the standard library function, which enables self, then children
● GX2 “VSA” BIOS made this impossible
● The fix? Take the standard one, comment out a bit, use the rest
● Fixed in 1 minute
In other words:
static struct device_operations pci_domain_ops = {
	.read_resources = pci_domain_read_resources,
	.set_resources = pci_domain_set_resources,
	.enable_resources = enable_childrens_resources,
	.init = 0,
	.scan_bus = pci_domain_scan_bus,
};
This is the key
That pci_domain_set_resources function
● Enables self, then children
● But enabling self caused “issues”
● The fix: create a private version:
static void pci_domain_set_resources(device_t dev)
{
	assign_resources(&dev->link[0]);
}
● In that same file; done
So, easy, right?
● Yes: if you know
● No: if you don't
● And that's one of the problems: this is tricky code, and people sometimes have trouble with it
The other thing V2 does well
● It does not do more than it should
– No timers, tasks, dynamic modules, file systems,...
● It lets Linux do the heavy lifting, in most cases
● That we will preserve
● And we will preserve the config tool – in some form
The config tool
● This tool confuses many people
● What does it do?
● Config tool allows you to take a disparate set of parts and create a mainboard
● Allows you to specify different control parameters for each part – e.g. if you have many southbridges of same type, and they are all different
● Generates the complex tree of initialized C structs that constitutes the 'static device tree' for the board
The static device tree
● It is a pre-initialized set of structures that defines the parts, their initial state, and their topology
● It is modified at runtime
– Parts are dynamically discovered and added
– Parts can be disabled if they don't exist (think of CPUs in an SMP)
● It is pretty complex!
● Hard to imagine setting it up manually (we've tried)
Config tool (cont.)
● Config tool can generate a hierarchical build directory, with multiple BIOS instances, for one mainboard
● Each part can have its own Config file
● Detects over-usage of variables – i.e. you only get to set things once – there is a reason
Config tool – why?
● It is the result of a lot of learning
● In the beginning – to build, you set up a directory with symlinks – and then the BIOS got built with a simple command
– You never knew what was what
● Then we had a makefile for each board and type
– e.g. make -f RONS_EPIA
● Then we had the V1 config tool
V1 config tool
● Modeled on BSD config tool
● On BSD you do this:
– config xyz
– cd ../xyz; make
● And you get a kernel
● Source tree does not get contaminated with lots of .o's
● Linux is learning how to do this – I hear it almost works
V1 config tool
● Very simple two and three-part commands
– e.g. northbridge amd/gx2
● No nested structure
– The old PCs were hardwired a certain way
● No ability to have two parts of same type
● No real conditionals – these came later as a hack
V1 config tool
● K8 broke it bigtime
● K8 had lots of parts of same type
● We tried, valiantly, to do stuff like
– northbridge amd/k8
● And then create lots of variables
● It was impossible
● We had to kill the tool
V2 config tool
● Avoid problems seen in V1
– Repeated setting/clearing of variables – set once
– Allow for specification of parts and control of their bits
– Allow some math (e.g. 512*1024)
● The question is, if we want to ditch the config language, how do we do this in Kconfig?
Discussion: how to config?
In fact, what should we preserve?
● Minimal assembly code
● “Let Linux do it”
● No callbacks
– This is a security issue as well
– BIOS should never run once the OS is up
– And, yes, ACPI is a security issue!
● Support for different types of configuration for single motherboard – i.e. Multiple targets
● The structure of the tree for the most part: type/vendor/partname is easy to think about
What we want to improve
● Documentation (a perennial problem)
● Ease of building LinuxBIOS
● Remove ld 'tricks' – move to one simple ldscript
● Config system, which is also very powerful, but not like (e.g.) linux kconfig
● romcc/gcc separation
– And the fact that a full C compiler is part of LinuxBIOS
– Remove our C compiler
● Jump to 64-bit mode immediately if possible
Improvements (cont.)
● Assembly code structure
– Make it small
● Assembly include structure has incredible flexibility
– More than we needed
– It's too flexible, and that makes it hard for people to follow
● Allow any C99 compiler to compile, and build, LinuxBIOS
– Remove gcc tricks, in other words
Other goals
● One assembly file with no includes (doable)
● Cache-as-RAM only
– Eliminates ROMCC
● Kconfig
● One-stop build environment for everything
– Believe it or not, this is done by OLPC, so we can take it
● Simplify device model if possible
– Have to be careful here; don't want to lose the capability
The bigger question: what's the build model?
● Current LinuxBIOS build model is based on “perpetual care” -- the OS model
● Once a motherboard is supported, we support it forever
● How realistic is this?
● It turns out, not very, because:
– Motherboards have a very short life
– They are constantly changing
– Chips change out from under you
● So, the build model is betrayed by hardware
Simple motherboard example
● Taken from real life
● Board A works, we port to it
● Board B comes along, it is “just like” A, same chipset (same part #s)
– And A is no longer made; we don't have any
– And B has a bugfixed chipset
● We extend LB to support B, add software fix for bug fix
● We find out (sometimes a year later), we broke A, and it is impossible to find out why
Simple build example
● We have working motherboard A for gcc 3.x
● 'A' goes out of use
– We can no longer test; we have none
● We upgrade (we have to) to gcc 4.1
● We later find out that, due to gcc issues, 'A' has stopped working
● This is messy stuff; even Linux has trouble with it
– Just try to build Linux 2.0 sometime with recent toolchain (I have); it won't work
A few lessons from this
● From the point of view of mainboard 'A', the code base is, at some point, going to be incompatible
– i.e. once 'A' works, it is frozen in time
● As time goes on, everything is a cross compilation
● Specification of build environment needs to include toolchain version
– Gcc just changes too much
– We've had lots of problems over the years traced to gcc/gld modifications (and just had one on OLPC!)
Why doesn't the OS model work for a BIOS?
● OSes are designed for a type of machine – e.g. a “PC”
● BIOS is designed for a specific instance of a machine – e.g. “VIA Epia ML10005.5”
● The “PC” has been around for 25 years
● An instance of a “PC” may be around for 25 weeks – 50x higher change rate
● A difference that has huge impact on BIOS may not even be visible to the OS
● That's why BIOSes are so much harder
What are the implications of using the OS model for BIOS?
● We've gone to great effort in LinuxBIOS to make the code the union of all code for supported motherboards
● That explains the complex include, ldscript, and some other points of confusion in the code
● But that's an OS model – and we already know OS model is not the right model for BIOS
A different model: “copy and convert”
● This model has been used very effectively in Plan 9
● It stands in stark contrast to current practice in Linux, *BSD, etc. (and, hence, LinuxBIOS)
● The development model looks like a continuing series of “microforks” -- since it's only one or two files at each point
● So, e.g., ppc 405 and ppc440 have totally different build directories, with different l.s, not the #ifdef complexity you see in Linux
How would “copy and convert” affect LinuxBIOS
● Example: crt0.s
● Right now crt0.s is a sum of confusing parts which can do anything for any x86 cpu
– And that 'do anything' is a big problem
● The new idea is that we have a common crt0.s, for, e.g., 386, all in one file, that will work for most anything
● But if your platform is “special”, you “copy and convert” crt0.s – you fork it – for your platform
– This is ok! Because in 6 months, your platform is no longer made
One way this might work: “fire and forget”
● Set up a build for a motherboard
● Make it work
● But don't try to make it work forever
● This is how we're doing OLPC
– Svn makes it easy
● With every OLPC build comes the version number of the working LinuxBIOS version
– A specification of how to build it
● We need to tighten these specs – gcc version comes to mind
Taking “fire and forget” to its extremes
● One possibility – the “check out” model
● You have the source tree and you wish to build OLPC
● You type 'lbget olpc/rev_a /tmp/buildit' and it pulls all the bits needed into /tmp/buildit
● /tmp/buildit looks like this:
● /tmp/buildit/src (all .c, .s), /tmp/buildit/include (all .h), /tmp/buildit/rom, /tmp/buildit/Makefile
Note:
● This 'checkout' model died a quick and painless death in discussion
What is going on here?
● All source is in /tmp/buildit – not the tree
● Includes too
● cd /tmp/buildit; make – you have a BIOS image
● You can then diverge from the LB tree as needed
– You are standalone at this point!
● And this is another point: forks happen
● Given that a motherboard is obsolete in a few months, does this matter?
● How many files would this be?
That's just one model
● It's possible that a simple set of changes is best
● Preserve what we have, but make a few mods
● Single l.s
● Design for cache-as-ram for all boards
– Remove romcc
● Remove all ldscripts that can be removed
● Make the object model simpler to understand (how to do this one?)
So what does this source tree look like?
● It won't actually change that much
● You'll have a 386 crt0.s
● And then for special cases, a mainboard crt0.s
● It will NOT be full of #includes
● auto.c will go away since romcc is going away
● We can use YH Lu's work for the cache-as-ram
These are actually big changes
● Remove romcc
● Remove a lot of build related to l.s
● Remove ld scripts that generate initialized structures and replace with generated C code (Greg Watson has already done this as a test – it can work)
How the source tree might look for (e.g) OLPC
● src/mainboard/olpc/rev_a/crt0.s
– Copied and adapted as needed
● src/mainboard/olpc/rev_a/ldscript.ld
– What if this breaks?
– You have a known-good version #
– You figure out what happened and fix it
● Will some mainboards break over time?
– Yes, the obsolete ones; but note, this happens with gcc anyway!
Further changes
● People keep asking if we can completely eliminate config language and tool
– Can we do this? As I said, I'm not sure
– But there is a feeling we can
● Config tool's main purpose is to create:
– static tree (but that's a mainboard item)
– The Makefiles (but Linux 2.6 shows another way)
● Linux config ideas have been “bleeding” into LinuxBIOS anyway:
– Payload-${1} for example
Discussion
● Whiteboard discussion.
● What do you think?
● What would you most like to see change?
● I'm taking notes ...
Note:
● Discussion was vigorous, and resulted in a new design. The design document will be released in November.