General Problem
I've been developing a simple bootloader and have stumbled on a problem in some environments where instructions like these don't work:
mov si, call_tbl ; SI=Call table pointer
call [call_tbl] ; Call print_char using near indirect absolute call
; via memory operand
call [ds:call_tbl] ; Call print_char using near indirect absolute call
; via memory operand w/segment override
call near [si] ; Call print_char using near indirect absolute call
; via register
Each of these involves an indirect near CALL to an absolute memory offset. I have discovered that I have the same issues if I use similar JMP tables. Calls and jumps that are relative don't seem to be affected. Code like this works:
call print_char
I have taken the advice presented on Stackoverflow by posters discussing the dos and don'ts of writing a bootloader. In particular I saw this Stackoverflow answer with General Bootloader Tips. The first tip was:
Taking all the advice, I didn't rely on CS, I set up a stack, and set DS to be appropriate for the ORG (Origin offset) I used. I have created a Minimal Complete Verifiable example that demonstrates the problem. I built this using NASM, but it doesn't seem to be a problem specific to NASM.
Minimal Example
The code to test is as follows:
[ORG 0x7c00]
[Bits 16]
section .text
xor ax, ax
mov ds, ax ; DS=0x0000 since OFFSET=0x7c00
cli ; Turn off interrupts for potentially buggy 8088
mov ss, ax
mov sp, 0x7c00 ; SS:SP = Stack just below 0x7c00
sti ; Turn interrupts back on
mov si, call_tbl ; SI=Call table pointer
mov al, [char_arr] ; First char to print 'B' (beginning)
call print_char ; Call print_char directly (relative jump)
mov al, [char_arr+1] ; Character to print 'M' (middle)
call [call_tbl] ; Call print_char using near indirect absolute call
; via memory operand
call [ds:call_tbl] ; Call print_char using near indirect absolute call
; via memory operand w/segment override
call near [si] ; Call print_char using near indirect absolute call
; via register
mov al, [char_arr+2] ; Third char to print 'E' (end)
call print_char ; Call print_char directly (relative jump)
end:
cli
.endloop:
hlt ; Halt processor
jmp .endloop
print_char:
mov ah, 0x0e ; Write CHAR/Attrib as TTY
mov bx, 0x00 ; Page 0
int 0x10
ret
; Near call address table with one entry
call_tbl: dw print_char
; Simple array of characters
char_arr: db 'BME'
; Bootsector padding
times 510-($-$$) db 0
dw 0xAA55
I build both an ISO image and a 1.44MB floppy image for test purposes. I'm using a Debian Jessie environment but most Linux distros would be similar:
nasm -f bin boot.asm -o boot.bin
dd if=/dev/zero of=floppy.img bs=1024 count=1440
dd if=boot.bin of=floppy.img conv=notrunc
mkdir iso
cp floppy.img iso/
genisoimage -quiet -V 'MYBOOT' -input-charset iso8859-1 -o myos.iso -b floppy.img -hide floppy.img iso
I end up with a floppy disk image called floppy.img and an ISO image called myos.iso.
Expectations vs Actual Results
Under most conditions this code works, but in a number of environments it doesn't. When it works it simply prints this on the display:
BMMME
The first B is printed using a typical CALL with a relative offset, and that seems to work fine. In some environments when I run the code I just get:
B
And then it appears to stop doing anything. It seems to print out the B properly, but then something unexpected happens.
Environments that seem to work:
- QEMU booted with floppy and ISO
- VirtualBox booted with floppy and ISO
- VMWare 9 booted with floppy and ISO
- DosBox booted with floppy
- Officially packaged Bochs(2.6) on Debian Jessie using floppy image
- Bochs 2.6.6(built from source control) on Debian Jessie using floppy image and ISO image
- AST Premmia SMP P90 system from mid 90s using floppy and ISO
Environments that don't work as expected:
- Officially packaged Bochs(2.6) on Debian Jessie using ISO image
- 486DX based system with AMI BIOS from the early 90s using floppy image. CDs won't boot on this system so the CD couldn't be tested.
What I find interesting is that Bochs (version 2.6) doesn't work as expected on Debian Jessie using an ISO. When I boot from the floppy with the same version it works as expected.
In all cases the ISO and the floppy image seemed to load and start running, since in ALL cases it was at least able to print out B on the display.
My Questions
- When it fails, why does it only print out a B and nothing more?
- Why do some environments work and others fail?
- Is this a bug in my code or the hardware/BIOS?
- How can I fix it so that I can still use near indirect Jump and Call tables to absolute memory offsets? I am aware I can avoid these instructions altogether and that seems to solve my problem, but I'd like to be able to understand how and if I can use them properly in a bootloader.
The Problem
The answer to your question is buried in your question, it just isn't obvious. You quoted my General Bootloader Tips:
Your code correctly sets up DS, and sets its own stack (SS, and SP). You didn't blindly copy CS to DS, but what you do do is rely on CS being an expected value (0x0000). Before I explain what I mean by that, I'd like to draw your attention to a recent Stackoverflow answer I gave about how the ORG directive (or the origin point specified by any linker) works together with the segment:offset pair used by the BIOS to jump to physical address 0x07c00.
The answer details how CS being copied to DS can cause problems when referencing memory addresses (variables for example). In the summary I stated:
The key thing is: don't assume CS is a value we expect. So your next question may be: I don't seem to be using CS, am I? The answer is yes, you are. Normally when you use a typical CALL or JMP instruction it looks like this:
call print_char
jmp somewhereelse
In 16-bit code both of these are relative jumps. This means that you jump forward or back in memory, but as an offset relative to the instruction right after the JMP or CALL. Where your code is placed within a segment doesn't matter, as the target is a plus/minus displacement from where you currently are. The current value of CS doesn't actually matter with relative jumps, so they should work as expected.
Your example of instructions that don't always seem to work correctly included:
call [call_tbl] ; Call print_char using near indirect absolute call
; via memory operand
call [ds:call_tbl] ; Call print_char using near indirect absolute call
; via memory operand w/segment override
call near [si] ; Call print_char using near indirect absolute call
; via register
All of these have one thing in common. The addresses that are CALLed or JMPed are ABSOLUTE, not relative. The offset of the label will be influenced by the ORG (origin point of the code). If we look at a disassembly of your code we will see this:
objdump -mi8086 -Mintel -D -b binary boot.bin --adjust-vma 0x7c00
boot.bin: file format binary
Disassembly of section .data:
00007c00 <.data>:
7c00: 31 c0 xor ax,ax
7c02: 8e d8 mov ds,ax
7c04: fa cli
7c05: 8e d0 mov ss,ax
7c07: bc 00 7c mov sp,0x7c00
7c0a: fb sti
7c0b: be 34 7c mov si,0x7c34
7c0e: a0 36 7c mov al,ds:0x7c36
7c11: e8 18 00 call 0x7c2c ; Relative call works
7c14: a0 37 7c mov al,ds:0x7c37
7c17: ff 16 34 7c call WORD PTR ds:0x7c34 ; Near/Indirect/Absolute call
7c1b: 3e ff 16 34 7c call WORD PTR ds:0x7c34 ; Near/Indirect/Absolute call
7c20: ff 14 call WORD PTR [si] ; Near/Indirect/Absolute call
7c22: a0 38 7c mov al,ds:0x7c38
7c25: e8 04 00 call 0x7c2c ; Relative call works
7c28: fa cli
7c29: f4 hlt
7c2a: eb fd jmp 0x7c29
7c2c: b4 0e mov ah,0xe ; Beginning of print_char
7c2e: bb 00 00 mov bx,0x0 ; function
7c31: cd 10 int 0x10
7c33: c3 ret
7c34: 2c 7c sub al,0x7c ; 0x7c2c offset of print_char
; Only entry in call_tbl
7c36: 42 inc dx ; 0x42 = ASCII 'B'
7c37: 4d dec bp ; 0x4D = ASCII 'M'
7c38: 45 inc bp ; 0x45 = ASCII 'E'
7dfd: 00 55 aa add BYTE PTR [di-0x56],dl
I've manually added some comments where the CALL statements are, including both the relative ones that work and the near/indirect/absolute ones that may not. I've also identified where the print_char function is, and where it appears in the call_tbl.
From the data area after the code we see that the call_tbl is at 0x7c34 and that it contains a 2-byte absolute offset of 0x7c2c. This is all correct, but when you use an absolute 2-byte offset it is assumed to be in the current CS. If you have read the Stackoverflow answer (that I referenced earlier) about what happens when the wrong DS and offset are used to reference a variable, you might now realize that the same may apply to JMPs and CALLs that use NEAR 2-byte absolute offsets.
As an example let us take this call that doesn't always work:
call [call_tbl]
The call target is loaded from DS:[call_tbl]. We properly set DS to 0x0000 when we start the bootloader, so this correctly retrieves the value 0x7c2c from memory address 0x0000:0x7c34. The processor then sets IP=0x7c2c, BUT it assumes that offset is relative to the currently set CS. Since we can't assume CS is an expected value, the processor can potentially CALL or JMP to the wrong location. It all depends on what CS:IP the BIOS used to jump to our bootloader (it can vary).
In the case where the BIOS does the equivalent of a FAR JMP to our bootloader at 0x0000:0x7c00, CS will be set to 0x0000 and IP to 0x7c00. When we encounter call [call_tbl] it resolves to a CALL to CS:IP=0x0000:0x7c2c. This is physical address (0x0000<<4)+0x7c2c=0x07c2c, which is in fact where the print_char function physically starts in memory.
Some BIOSes do the equivalent of a FAR JMP to our bootloader at 0x07c0:0x0000, so CS will be set to 0x07c0 and IP to 0x0000. This too maps to physical address (0x07c0<<4)+0=0x07c00. When we encounter call [call_tbl] it resolves to a CALL to CS:IP=0x07c0:0x7c2c. This is physical address (0x07c0<<4)+0x7c2c=0x0f82c. This is clearly wrong, since the print_char function is at physical address 0x07c2c, not 0x0f82c.
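The two address calculations above can be double-checked at assembly time. The following is only a small sketch using the NASM preprocessor; the macro names (TARGET_OFF, PHYS_CS0000, PHYS_CS07C0) are made up for illustration and the file emits no code. If either value were wrong, NASM would stop with the corresponding %error message.

```nasm
; Sketch: verify the segment:offset arithmetic with NASM's preprocessor.
; All macro names here are illustrative and not part of the bootloader.
%assign TARGET_OFF   0x7c2c                        ; offset stored in call_tbl
%assign PHYS_CS0000  ((0x0000 << 4) + TARGET_OFF)  ; BIOS entered at 0x0000:0x7c00
%assign PHYS_CS07C0  ((0x07c0 << 4) + TARGET_OFF)  ; BIOS entered at 0x07c0:0x0000

%if PHYS_CS0000 != 0x07c2c
  %error "CS=0x0000 does not land on print_char"
%endif
%if PHYS_CS07C0 != 0x0f82c
  %error "CS=0x07c0 case is not 0x0f82c"
%endif
```

Assembling this with nasm -f bin should complete silently, confirming that the same 16-bit offset lands 0x7c00 bytes too high when CS is 0x07c0.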
Having CS set incorrectly will cause problems for JMP and CALL instructions that do Near/Absolute addressing, as well as for any memory operands that use a segment override of CS:. An example of using the CS: override in a real mode interrupt handler can be found in this Stackoverflow answer.
Since it has been shown that we can't rely on CS that is set when the BIOS jumps to our code we can set CS ourselves. To set CS we can do a FAR JMP to our own code which will set CS:IP to values that make sense for the ORG (origin point of the code and data) we are using. An example of such a jump if we use ORG 0x7c00:
jmp 0x0000:$+5
$+5 says to use an offset that is 5 above our current program counter. A far jmp is 5 bytes long, so this has the effect of doing a far jump to the instruction after our jmp. It could have been coded this way too:
jmp 0x0000:farjmp
farjmp:
When either of these instructions is complete, CS will be set to 0x0000 and IP will be set to the offset of the next instruction. The key thing for us is that CS will be 0x0000. When paired with an ORG of 0x7c00 it will properly resolve absolute addresses so that they work when physically running on the CPU. 0x0000:0x7c00=(0x0000<<4)+0x7c00=physical address 0x07c00.
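Put together, a cut-down version of the test bootloader with the far jump added might look like the sketch below. It keeps only one indirect call from the question for brevity, and the labels main, .setcs and end are names I picked for illustration; treat it as an outline of the fix under ORG 0x7c00 rather than a drop-in replacement.

```nasm
[ORG 0x7c00]
[BITS 16]

section .text
main:
    jmp 0x0000:.setcs     ; Far jump forces CS=0x0000 to match ORG 0x7c00
.setcs:
    xor ax, ax
    mov ds, ax            ; DS=0x0000 since ORG is 0x7c00
    cli
    mov ss, ax
    mov sp, 0x7c00        ; SS:SP = stack just below 0x7c00
    sti

    mov al, 'M'
    call [call_tbl]       ; Near indirect absolute call; the offset is now
                          ;     interpreted against the CS we set ourselves
end:
    cli
.endloop:
    hlt                   ; Halt processor
    jmp .endloop

print_char:
    mov ah, 0x0e          ; Write CHAR/Attrib as TTY
    mov bx, 0x00          ; Page 0
    int 0x10
    ret

call_tbl: dw print_char   ; Near call address table with one entry

times 510-($-$$) db 0     ; Bootsector padding
dw 0xAA55
```

It can be built the same way as the original (nasm -f bin boot.asm -o boot.bin). Because the far jump is the very first instruction, nothing that depends on CS runs before CS has a known value.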
Of course if we use ORG 0x0000 then we need to set CS to 0x07c0. This is because (0x07c0<<4)+0x0000=0x07c00. So we could code the far jmp this way:
jmp 0x07c0:$+5
CS will be set to 0x07c0 and IP will be set to the offset of the next instruction.
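For comparison, here is a sketch of the same idea built around ORG 0x0000. In this variant both CS and DS have to be 0x07c0 so that the offsets generated by the assembler line up with physical address 0x07c00. The label start is an illustrative name, and the stack placement simply mirrors the one used in the question.

```nasm
[ORG 0x0000]
[BITS 16]

section .text
    jmp 0x07c0:start      ; Far jump: CS=0x07c0, IP=offset of start
start:
    mov ax, 0x07c0
    mov ds, ax            ; DS=0x07c0 to match ORG 0x0000
    cli
    xor ax, ax
    mov ss, ax
    mov sp, 0x7c00        ; SS:SP = 0x0000:0x7c00, stack just below the code
    sti

    mov al, 'M'
    call [call_tbl]       ; Table read through DS=0x07c0, target taken
                          ;     relative to CS=0x07c0
.endloop:
    hlt                   ; Halt processor
    jmp .endloop

print_char:
    mov ah, 0x0e          ; Write CHAR/Attrib as TTY
    mov bx, 0x00          ; Page 0
    int 0x10
    ret

call_tbl: dw print_char   ; Holds a small offset relative to segment 0x07c0

times 510-($-$$) db 0     ; Bootsector padding
dw 0xAA55
```

Either layout should behave the same on the hardware; what matters is that ORG, CS and DS agree with each other.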
The end result of all this is that we set CS to the segment we want, rather than relying on a value that we can't guarantee when the BIOS finishes jumping to our code.
Issues with Different Environments
As we have seen, CS can matter. Most BIOSes, whether in an emulator, virtual machine or real hardware, do the equivalent of a far jump to 0x0000:0x7c00, and in those environments your bootloader would have worked. Some environments, like older AMI BIOSes and Bochs 2.6 when booting from a CD, start our bootloader with CS:IP = 0x07c0:0x0000. As discussed, in those environments near/absolute CALLs and JMPs will proceed to execute from the wrong memory locations and cause our bootloader to function incorrectly.
So what about Bochs working for a floppy image and not for an ISO image? This is a peculiarity of earlier versions of Bochs. When booting from a floppy the virtual BIOS jumps to 0x0000:0x7c00, and when it boots from an ISO image it uses 0x07c0:0x0000. This explains why it works differently. This odd behavior apparently came about because of a literal interpretation of one of the El Torito specifications that specifically mentioned segment 0x07c0. Newer versions of Bochs' virtual BIOS were modified to use 0x0000:0x7c00 for both.
Does this Mean some BIOSes have a Bug?
The answer to this question is subjective. In the first versions of IBM's PC-DOS (prior to 2.1) the bootloader assumed that the BIOS jumped to 0x0000:0x7c00, but this wasn't clearly defined. Some BIOS manufacturers in the 80s started using 0x07c0:0x0000 and broke some early versions of DOS. When this was discovered, bootloaders were modified to be well behaved and not make any assumptions about which segment:offset pair was used to reach physical address 0x07c00. At the time one may have considered this a bug, but it stemmed from the ambiguity introduced by 20-bit segment:offset pairs, where many different pairs map to the same physical address.
Since the mid 80s, it is my opinion that any new bootloader that assumes CS is a specific value has been coded in error.