Skip to content

aarch64/exceptions#182

Closed
markfirmware wants to merge 1 commit intoZystemOS:feature/aarch64-portfrom
markfirmware:aarch64/exceptions
Closed

aarch64/exceptions#182
markfirmware wants to merge 1 commit intoZystemOS:feature/aarch64-portfrom
markfirmware:aarch64/exceptions

Conversation

@markfirmware
Copy link

@markfirmware markfirmware commented Jul 9, 2020

This has a gitpod commit for my convenience, then a commit for better exception handling. I have removed the gitpod support.

See https://developer.arm.com/documentation/ddi0487/fb/ page 2342.

The approach in this pr is to send all exceptions to one handler that decodes the situation. In particular there is a fault in tty.clear(), which might be a compiler defect. If you open the pr in gitpod it will zig build run and show the fault.

For my convenience I modified build.zig to be dedicated to the aarch64 variant and also changed objcopy to addInstallRaw(). addInstallRaw() will be a separate pr

Hopefully the tty.clear() fault is not an artifact of the gitpod environment.

Cheers, Mark

@SamTebbs33 SamTebbs33 self-requested a review July 9, 2020 08:33
Copy link
Collaborator

@SamTebbs33 SamTebbs33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement. I'd be happy to merge it with some changes and removal of the gitpod stuff and majority of the changes to build.zig.

@SamTebbs33
Copy link
Collaborator

Another thing I should mention is that I see the data abort fault locally too. It happens when attempting to write a pixel to the framebuffer, and I'm not sure why.

@markfirmware
Copy link
Author

Another thing I should mention is that I see the data abort fault locally too. It happens when attempting to write a pixel to the framebuffer, and I'm not sure why.

What I get is

[INFO] program starting at exception level 3
[INFO] Init mem
[INFO] Done mem
[INFO] Init panic
[INFO] Done panic
[INFO] Init tty
[INFO] Done tty
[INFO] Init done
[INFO] tty.clear() tty.clear = fn() void@aaaaaaaaaaaaaaaa
[ERROR] 
[ERROR] instruction abort
[ERROR] exception level 3 - no change
[ERROR] 32 bit instruction at 0xaaaaaaaaaaaaaaaa accessing 0xaaaaaaaaaaaaaaa8
[ERROR] current exception level 3 exception entry number 0x4
[ERROR] vbar_el3 0x81000
[ERROR] elr_el3 aaaaaaaaaaaaaaaa
[ERROR] esr_el3 86000010 class 0x21
[ERROR] far_el3 aaaaaaaaaaaaaaa8
[ERROR] sctlr_el3 c50838
[ERROR] spsr_el3 600003cd
[ERROR] core 0 is now idle in arm exception handler (other cores were already idle)

This looks like tty.clear is set but to a function at the zig undefined value.

It will be good to get both of us getting exactly the same fault results. I will run this on a regular x64 host.

One more anomaly - in my project my program starts at exception level 2 and not 3. I wonder if it is the version of qemu.

@markfirmware
Copy link
Author

When I say my project starts at exception level 2 I mean my own rpi project, not my pluto fork.

@markfirmware markfirmware requested a review from SamTebbs33 July 10, 2020 16:16
@SamTebbs33
Copy link
Collaborator

SamTebbs33 commented Jul 10, 2020

Another thing I should mention is that I see the data abort fault locally too. It happens when attempting to write a pixel to the framebuffer, and I'm not sure why.

What I get is

[INFO] program starting at exception level 3
[INFO] Init mem
[INFO] Done mem
[INFO] Init panic
[INFO] Done panic
[INFO] Init tty
[INFO] Done tty
[INFO] Init done
[INFO] tty.clear() tty.clear = fn() void@aaaaaaaaaaaaaaaa
[ERROR] 
[ERROR] instruction abort
[ERROR] exception level 3 - no change
[ERROR] 32 bit instruction at 0xaaaaaaaaaaaaaaaa accessing 0xaaaaaaaaaaaaaaa8
[ERROR] current exception level 3 exception entry number 0x4
[ERROR] vbar_el3 0x81000
[ERROR] elr_el3 aaaaaaaaaaaaaaaa
[ERROR] esr_el3 86000010 class 0x21
[ERROR] far_el3 aaaaaaaaaaaaaaa8
[ERROR] sctlr_el3 c50838
[ERROR] spsr_el3 600003cd
[ERROR] core 0 is now idle in arm exception handler (other cores were already idle)

That's really weird, I see (with no changes to your PR)

[INFO] program starting at exception level 3
[INFO] Init mem
[INFO] Done mem
[INFO] Init panic
[INFO] Done panic
[INFO] Init tty
[INFO] Done tty
[INFO] Init done
[INFO] tty.clear() aarch64.tty.clear = fn() void@aaaaaaaaaaaaaaaa
[ERROR] 
[ERROR] arm exception taken to level 3
[ERROR] this exception has not been seen previously in development - please update interrupts.zig
[ERROR] details
[ERROR]     elr_el3 0xaaaaaaaaaaaaaaaa
[ERROR]     esr_el3 0x86000000 ExceptionClass.instruction_abort
[ERROR]     exception entry offset 0x200 ExceptionTakenFrom.same_level_while_using_sp_elx ExceptionCategory.synchronous
[ERROR]     far_el3 0xaaaaaaaaaaaaaaaa
[ERROR]     sctlr_el3 0xc50838
[ERROR]     spsr_el3 0x600003cd
[ERROR]     vbar_el3 0x80800

tty.clear shouldn't be undefined as it's set to null by arch.aarch64.initTTY... I can't think of why we'd be seeing different faults either.

Copy link
Collaborator

@SamTebbs33 SamTebbs33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I'm seeing the tty.clear issue on my aarch64 branch as well so I don't think it's something that should stop this being merged. Once these couple of changes are made I think it will be ready for approval.

@markfirmware
Copy link
Author

[INFO] tty.clear() aarch64.tty.clear = fn() void@aaaaaaaaaaaaaaaa
[ERROR]
[ERROR] arm exception taken to level 3
[ERROR] this exception has not been seen previously in development - please update interrupts.zig
[ERROR] details
[ERROR] elr_el3 0xaaaaaaaaaaaaaaaa
[ERROR] esr_el3 0x86000000 ExceptionClass.instruction_abort
[ERROR] exception entry offset 0x200 ExceptionTakenFrom.same_level_while_using_sp_elx ExceptionCategory.synchronous
[ERROR] far_el3 0xaaaaaaaaaaaaaaaa
[ERROR] sctlr_el3 0xc50838
[ERROR] spsr_el3 0x600003cd
[ERROR] vbar_el3 0x80800


`tty.clear` shouldn't be undefined as it's set to `null` by `arch.aarch64.initTTY`... I can't think of why we'd be seeing different faults either.

The value being undefined might be a compiler defect. I will check the open aarch64 issues at zig.

Your esr_el3.iss field is 0 whereas mine is 0x10. I have added yours to the recognized list. What is your qemu version? Mine is 3.1.0.

@markfirmware markfirmware requested a review from SamTebbs33 July 10, 2020 18:32
@SamTebbs33
Copy link
Collaborator

SamTebbs33 commented Jul 11, 2020

The value being undefined might be a compiler defect. I will check the open aarch64 issues at zig.

Thanks, I'll keep an eye out.

Your esr_el3.iss field is 0 whereas mine is 0x10. I have added yours to the recognized list. What is your qemu version? Mine is 3.1.0.

I'm using qemu-system-aarch64 version 5.0.0!

Copy link
Collaborator

@SamTebbs33 SamTebbs33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement. We like PRs to be squashed before merging so would you mind doing so before?

Since this is a feature branch there is a chance this file will be changed a bit while the aarch64 port is being worked on but it looks good for inclusion now.

Do you have GitHub actions enabled on your fork? I think you need to in order for the tests to run. After they've run and passed you should be able to merge this PR.

@markfirmware
Copy link
Author

Thanks for this improvement. We like PRs to be squashed before merging so would you mind doing so before?

Done.

Since this is a feature branch there is a chance this file will be changed a bit while the aarch64 port is being worked on but it looks good for inclusion now.

Understood.

Do you have GitHub actions enabled on your fork? I think you need to in order for the tests to run. After they've run and passed you should be able to merge this PR.

I do have them enabled - I do not know why they are not running.

@markfirmware markfirmware requested a review from SamTebbs33 July 11, 2020 23:15
@markfirmware
Copy link
Author

There are reports that gh actions have issues with forks. Can you try making a branch of aarch64-port and merging aarch64/exceptions? If it does not pass tests then we can decide what to do next. If it passes, you could go ahead and merge it with aarch64-port and destroy the temporary branch. I tried creating a fresh repository and running tests there and it also did not run the actions.

@markfirmware
Copy link
Author

I am going to close this pr and open a new one to get the tests to trigger.

@SamTebbs33
Copy link
Collaborator

Thanks @markfirmware. We managed to get the actions running in #131 so I'm not sure why it wouldn't be happening in this case. Perhaps we should look for CI alternatives since we don't want to run into this issue again. I've submitted my own PR with your commit so that we can get the tests run.

@markfirmware markfirmware deleted the aarch64/exceptions branch July 13, 2020 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants