I wasn't sure where to post this, so I figured general was a good place. I have been a Bit reader for a while and have worked with PCs as a hobby for years. After college, I had the opportunity to get involved with BIOS development and I jumped at it. In part, I have Bit to thank for that, so here is some payback.
I am reluctant to present my credentials. The industry is fairly small, and I am concerned about touching a nerve with my employer as I am an engineer, not a PR person. I am also concerned about what future employers might think, as Google makes old stuff relatively easy to find. So you will just have to live with my paranoia, sorry.
My next disclaimer is that BIOS is not what I work on daily. By BIOS I mean good old fashioned, pure ASM, 16 bit IBM PC BIOS. I work with a different type of PC firmware system. Still, BIOS training was part of my job, I have had to do some work with it, and there are similarities between BIOS and what I work on now. So I think I will be sufficient. Anyone else from the BIOS industry can feel free to correct whatever I get wrong, or just not quite right. I’m trying not to let anything slip past me, but I am human and make mistakes.
For everyone else, have fun with any grammar mistakes, spelling errors, or typos that slipped by me while writing this. I don’t think I have missed any, as I have proofread this a few times, but generally I suck at proofreading my own stuff. For hex numbers, I am using C notation as I think that will be the most common.
The tech talk can get pretty thick very quickly. I had been working with PCs as a hobby for a while, and had taken PC architecture courses as part of my degree, and I still had a lot to learn. If something isn’t clear, I can try and clear it up in the discussion.
BIOS stands for Basic Input Output System, and it is an old, archaic beast that has a lot to it. Most of it isn’t seen unless you are looking really hard for it. IMHO, transparency should be a BIOS design goal.
The roots of BIOS are with IBM and the PC. In the IBM PC Technical Reference Manual, AKA “the purple book”, IBM included BIOS source code for their PCs. For years, this was the ‘standard’ that BIOS was based on (and still is to some degree). There is no standards body for BIOS, no one coordinating the industry. It has been purely market driven.
Before getting going, let me clear something up. The screen you need to frantically mash some special key to get to is not the BIOS. There have been cases where systems live happily without that bit of software at all. That piece of software is the BIOS configuration utility and is just the tip of the iceberg.
To get started on the tech stuff, I need to explain the segment/offset memory addressing of the processor’s 16 bit real mode. The 8088 could address 1 megabyte of RAM, for which 20 address lines are used. However, instead of a single register, the processor used two 16 bit registers to generate the address. The first register is called the segment register, the second is the offset. To generate the absolute physical address in the RAM address space the processor is looking at, you take the segment, multiply it by 16, and add the offset to that.
It’s far less complicated in hexadecimal (hex) than it sounds. Multiplying by 16 in hex works the same way as multiplying by 10 in our normal decimal system: you just add a zero. Using C notation for hex numbers, a segment/offset pair for an address is written as 0xF000:0x1337. The address being referred to is then calculated as 0xF0000+0x1337=0xF1337. But the segment register can hold any 16 bit number, so segments can overlap. This can actually be a useful feature for carving up a single 64K segment into smaller chunks.
In the IBM PC architecture, the segments above the 640K mark (the 0xA000 segment and above) are reserved for hardware such as video, disk controllers and BIOS.
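If it helps to see the segment/offset arithmetic spelled out, here is a minimal sketch in Python. The function name is just something I made up for illustration, and the wrap at 1MB is my assumption to model an 8088 with only 20 address lines.

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate a real mode segment:offset pair into a 20 bit physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits is the multiply by 16 (appending a hex zero).
    # The mask models the 8088's 20 address lines, so sums past 1MB wrap around.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0xF000, 0x1337)))  # 0xf1337

# Overlapping segments: two different pairs can name the same physical byte.
print(physical_address(0xF000, 0x1337) == physical_address(0xF133, 0x0007))  # True
```

The second print shows the overlap: 0xF133:0x0007 and 0xF000:0x1337 both land on 0xF1337.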
Within that reserved area, 0xA000 and 0xB000 are used for video cards, 0xC000 for video and disk controller BIOSes, and 0xF000 for the system BIOS. The 0xE000 segment is a sort of special area that has done a number of things for the system. It has served as a spill over zone for BIOS and add in card BIOS, and has also served as the EMM window when that technology was first introduced.
When the processor comes out of reset, which happens both on a reset and when the system is powered on, it is hard wired to start executing code at 0xF000:0xFFF0. This is the processor’s reset vector. At this point in time, the processor can be thought of as being functionally the same as an 8088. What this means is that BIOS can’t use PCI at this point, and that knocks out a lot of things like USB, video, RAID adaptors, network interfaces, etc. All that can be assumed is that x86 I/O ports can be accessed, and that there is one meg of address space. Everything else will need some amount of work to use.
That does not mean the BIOS can assume 1 meg of RAM is really present. Because the RAM is on modules that can be inserted or removed, the BIOS must test for RAM and its size. I know it is impossible to get a memory module under 1 meg now, but we still need to check if any RAM is present in the system. Also on the to-do list is updating the processor’s microcode, if that needs to be done shortly after a reset.
So, BIOS starts by loading microcode into the processor if it is needed, discovering information about the memory installed in the system, and setting up the memory controller appropriately. Depending on the chipset, there may be a few other things to do as well, but all this usually falls under the umbrella of what is called reference code. This is source and binaries provided by the processor and chipset manufacturers for the BIOS to run. Once that is done, the processor should be fully functional with all its features turned on, and the system now has usable RAM.
At this point BIOS can stop making some of the 8088 architecture assumptions too. At some point after this, the system will be put into a sort of hybrid mode called flat mode, big real mode, unreal mode, or any of the other names it goes by. In this mode, the system can address all 4GB of memory address space, though programs are still limited to executing code from the lower 1MB of the memory space. Still, data can now be placed anywhere in the address space.
The BIOS will now change how the 0xF000 segment works. It will be shadowed, meaning those 64K are served from RAM instead of the address range being mapped to a ROM region on the flash part. The BIOS will also be able to access the entire BIOS flash part, which the chipset has mapped to a memory location. Usually this is at the top of the address space (so it ends at 0xFFFFFFFF and starts the size of the flash part below the 4GB mark). The main parts of the BIOS are copied into 0xF000, with other parts of the BIOS loaded into other regions of memory set aside for BIOS use.
Now all the ‘heavy lifting’ begins. First, there are a number of chipset devices that need to get configured. One example from an Intel chipset is what is called the Root Complex Base Address (RCBA). These devices are needed to access the rest of the chipset, but often involve memory mapped IO ranges, so they need the memory controller to be set up first.
Once these devices are configured, PCI can be enumerated. Much of this process is in the PCI specification IIRC, so check it out if you can find a copy. This is a fairly involved process. First, the BIOS needs to find all the PCI bridges in the system and learn how they are connected to each other (it’s usually a tree, but there may be more than one root bridge in some cases). Then it assigns bus numbers to the bridges. The bus number forms the first part of the address system used to communicate with a PCI device.
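To give a feel for what a PCI address looks like in practice, here is a rough sketch of one well known encoding: the legacy configuration mechanism that writes a 32 bit value to I/O port 0xCF8. The post itself does not go into this level of detail, so treat the bit layout as my summary of the PCI specification, and the function name as made up for illustration.

```python
def pci_config_address(bus: int, device: int, function: int, register: int = 0) -> int:
    """Build the 32 bit value written to port 0xCF8 to select a config register."""
    if not 0 <= bus <= 0xFF:
        raise ValueError("bus number is 8 bits")
    if not 0 <= device <= 0x1F:
        raise ValueError("device number is 5 bits")
    if not 0 <= function <= 0x7:
        raise ValueError("function number is 3 bits")
    if not 0 <= register <= 0xFF:
        raise ValueError("register offset is 8 bits")
    return (
        (1 << 31)            # enable bit
        | (bus << 16)        # bits 23..16: bus number
        | (device << 11)     # bits 15..11: device number
        | (function << 8)    # bits 10..8: function number
        | (register & 0xFC)  # bits 7..2: dword aligned register offset
    )


# Bus 1, device 2, function 3, register 0x10 (the first base address register).
print(hex(pci_config_address(1, 2, 3, 0x10)))  # 0x80011310
```

On real hardware the value would be written to port 0xCF8 and the data read back from port 0xCFC. That part is left out since it can't be run from a normal program.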
The rest of the address is made up of the device and function numbers, which are either hard wired for on chip devices, or based on the slot the card is plugged into. It is customary to refer to a device and its collection of functions as just a PCI device, since a PCI device must have at least one function, and additional functions are still part of that device.
PCI devices use resources in the form of memory and IO port addresses. A device may consume one or both types of these resources. In addition, the device also reports capabilities, such as power management capabilities and extended bus attributes in the case of PCI-X or PCI Express, that the BIOS needs to pay attention to. The BIOS collects this information by looking to see if a device and function number exists on a bus number. It then checks the PCI function’s base address registers to determine the resource types and the amount of resources the device is requesting. It also checks predefined locations in the device’s PCI configuration space for special capabilities. BIOS then uses this information to build a resource map, and hands out memory and IO addresses to the devices. For devices sharing a bus, the collective ranges should be contiguous, because the bridge that provides the bus needs to know whether an address decodes to its bus, or whether to pass the request on down the line.
Finally, the bus is checked to see if a device needs to run what is called an option ROM. These are things like video BIOS, RAID BIOS, etc. These extend the platform’s capabilities by modifying entries in the interrupt vector table (IVT).
It’s worth taking a moment to discuss the IVT, as it is a primary interface BIOS provides for the OS, and is quite worthless once the OS loads. It is a fundamental part of the 8088 architecture. Basically, when the processor receives an interrupt, either via the INTR pin or the int op code, it takes the number attached to the interrupt and resolves that number to an address in the IVT.
Each IVT entry is a pointer, in segment/offset format, to the code that handles the interrupt. Say the processor gets an interrupt of value 0x10. IVT entries are dwords (4 bytes each), so this resolves to the 0x10th dword in memory, at address 0x40. The lower 16 bits of this dword are the offset of the interrupt handler, and the upper 16 bits are the segment it lives in. These handlers can exist just about anywhere in the lower 1MB of memory.
Int 0x10 corresponds to video services, which are provided via the video hardware. Usually, the video BIOS is mapped at the beginning of the 0xC000 segment. So, when the BIOS runs the video BIOS, the video BIOS will change the entry at 0x40 to point to something in its BIOS in the 0xC000 segment. RAID cards work the same way (but are usually mapped to a different address). But all of this is 16 bit code, and quite worthless once the OS switches the system to protected or long mode. So these handlers just waste space once the OS is running.
If there is a PnP ISA bus in the system, BIOS also handles this, but these are uncommon outside of certain niche computing fields, so I am not sure it is worth discussing. I personally have never had to deal with this, as I entered the field after this bus became very rare. It is also possible for BIOS to handle other expansion buses, but one reason PCIe holds its place in the PC instead of the other competing “third generation IO” technologies is that it is programmed almost exactly like PCI, which means most of the PCI code is reusable.
Now for the reason for all the work to get PCI running: the BIOS can now discover bootable devices (well, there are other reasons, but this is a big one). These are things like hard drives, CD-ROMs, floppy disks, USB drives, etc. They needed to be discovered so they could run their specific BIOSes and add to the platform. The BIOS will then query those devices through their designated interfaces (such as int 0x13) to look for bootable media.
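Going back to the IVT for a moment, here is a small sketch of how an entry is located and how an option ROM might hook one. It simulates the first 1KB of memory with a bytearray. The helper names and the handler address 0xC000:0x0152 are made up for illustration. Note that in each 4 byte entry the offset word comes first (lower address) and the segment word second.

```python
import struct
from typing import Tuple


def ivt_entry_address(vector: int) -> int:
    """Each IVT entry is 4 bytes, so vector N lives at physical address N * 4."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must be 0..255")
    return vector * 4


def read_ivt_entry(memory: bytes, vector: int) -> Tuple[int, int]:
    """Return (segment, offset) of the handler stored for this vector."""
    # Little endian: the offset word is stored first, then the segment word.
    offset, segment = struct.unpack_from("<HH", memory, ivt_entry_address(vector))
    return segment, offset


def set_ivt_entry(memory: bytearray, vector: int, segment: int, offset: int) -> None:
    """Point a vector at a new handler, the way an option ROM hooks a service."""
    struct.pack_into("<HH", memory, ivt_entry_address(vector), offset, segment)


memory = bytearray(256 * 4)                  # simulated IVT: 256 entries at address 0
set_ivt_entry(memory, 0x10, 0xC000, 0x0152)  # hypothetical video BIOS handler
segment, offset = read_ivt_entry(memory, 0x10)
print(hex(ivt_entry_address(0x10)), hex(segment), hex(offset))  # 0x40 0xc000 0x152
```

Combining the pair with the segment/offset rule gives the physical address of the handler, here 0xC0000+0x0152=0xC0152.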
With the boot devices found, BIOS will invoke int 0x19 for initial program load (IPL), and control is passed to the boot sector of the media the user selected in the setup utility. In some cases, it is possible for control to come back to the BIOS if there is no boot sector, such as with an unformatted disk or no media in the drive. The BIOS then tries the next boot device in the list, and keeps going until something takes control.
But the role of BIOS doesn’t end there. The scope of BIOS has been extended more than once and there are a number of other things BIOS does for the system. As PCs have changed, a number of BIOS interfaces were added. The specifications for these are spread out, and I can’t think of any online source where they are all gathered up. This is assuming the specification is publicly available. Not all of them are used anymore, as some have been superseded by other specifications. This has happened as various companies have done their own thing and gained enough market share and support to have everyone use it. But pretty much everything IBM defined for the first PCs is still there too.
One example is the BIOS data area (BDA). It contains information about how the system is set up, but from the point of view of the original PC. So it will list information about the text mode the system supports, if it can do better than EGA, information on the COM and LPT ports, etc. It also contains a few small buffers for BIOS to pass information to the OS. You can check out http://www.bioscentral.com/misc/bda.htm for more information on the BDA.
The biggest interface right now is the Advanced Configuration and Power Interface (ACPI). It consists of sets of tables that describe hardware, as well as program routines written in a script language defined in the specification. So, when the OS wants to put the system to “sleep” (called S3 for the number of the sleep state), the OS prepares its resources for sleep, then calls the appropriate script language method.
There are a number of things that can happen from here, some of which are under the OS’s control, and some of which are not. For example, the script may cause the PC to enter system management mode (SMM). When the system enters this mode, the state of all the processors in the system gets saved so they can be restored as if nothing happened at the end of the system management interrupt (SMI). The system will do work in this mode that it could not otherwise do under control of the OS.
Now that I have introduced it, SMM is another thing BIOS leaves behind to use after the OS is running. But it is not under OS control, so it isn’t always viewed favorably. There are a number of things it can do though, and it can be essential for working around limitations of both the platform and the OS. Keyboard support without any keyboard controller is one example. If you see an option in your BIOS setup that talks about 60/64 or keyboard emulation, this is what that option is for.
Another important piece of data BIOS provides is the E820 table, named for the int 0x15 sub function 0xE820. The OS grabs this while the boot loader is still in 16 bit mode, because the E820 table describes the physical address ranges free for the OS to use, and what must not be disturbed. Some of the regions defined might be where the flash part is aliased in RAM, what memory is being used by chipset base address registers, and other places it would generally be bad for the OS to try and write to. The OS uses this information to build the data structures it needs to put the system into protected mode.
Another interface I have seen used in a few cases is something called the multi processor table (MP table). This specification was created before ACPI took over the tasks this interface performs. One of its major goals was handling interrupts when more than 1 processor is present. The legacy 8259 programmable interrupt controller (PIC) can only signal an interrupt to a single processor.
To overcome this, Intel developed the Advanced Programmable Interrupt Controller (APIC), which is a much more complicated, but much more capable, piece of hardware. The MP table contains a bunch of information about these, as well as about the processors in the system, and a few other bits and pieces.
These are the major things BIOS provides for the system that I can think of. But I’m sure I’m forgetting some things since BIOS is not my main focus. Hopefully though, this is a good overview of what that mysterious bit of code is, and might help to understand some of the upcoming technology.
Wow... Looks like you spent a lot of time on that. It's a complex topic so perhaps breaking your article down into different sections would help to reduce the complexity. I do love learning about computer system details so I quite thoroughly enjoyed reading this.
Some interesting information. I haven't done any BIOS programming myself, but I do have several years of practice writing code in assembler. Pretty much a lost art these days.
I don't write code in it as my primary language for work, but I still look at a lot of disassembled code when debugging. Inline assembly used to be useful as well. Sadly, the 64 bit compiler we use at work doesn't like it, nor does it like me trying to emit op codes directly either.
I will try and see if I can make a little better organization out of all this. It was more of a brain dump that I tried to make readable. What I really wanted to emphasise is just how much goes on before you ever see video, and that the only ways to learn about a system's state before all this work is done are POST and beep codes. I also wanted to try and explain that in order to boot, the system needs to look, to the boot loader software, like something that was made in the late 80s.