
laijs commented Feb 14, 2017

  1. The current TLS-based scheme causes some unneeded host task reschedules.
     If host thread B enters a syscall after host thread A has resumed to
     userspace, the host tasks behind the two threads are different, so the
     current code does a reschedule. This reschedule can be eliminated if
     host thread B reuses the host task that A just released.

  2. Use a stack (list_head) instead of TLS.
     When TLS is not enabled, all in-syscall threads compete to run host0,
     which can cause wrong wakeups or even deadlock. For example, suppose
     thread A has entered a syscall and is sleeping, waiting on something.
     Thread B then enters a syscall after A and wakes up host0; A and B now
     compete on host0's sched_sem to run. If A wins, it resumes running and
     soon sleeps again, and B cannot run. It is even worse when A is waiting
     on an event that depends on B: then A and B deadlock. (See the sketch
     below.)

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
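To make item 2 concrete, here is a minimal sketch of the idea, not the actual patch: idle host tasks sit on a LIFO stack built from list_head, and a host thread entering a syscall pops the most recently freed task instead of looking one up via TLS. All names below (host_task, get_host_task, put_host_task, free_lock) are illustrative, and the sketch assumes the kernel's linux/list.h helpers plus a pthread mutex for the host-side lock.

```c
#include <pthread.h>
#include <linux/list.h>          /* kernel list_head helpers */

struct host_task {
    struct list_head node;       /* links free tasks into the stack */
    /* ... per-task scheduling state ... */
};

static LIST_HEAD(free_tasks);    /* top of the free-task stack */
static pthread_mutex_t free_lock = PTHREAD_MUTEX_INITIALIZER;

/* Syscall entry: reuse the task that was freed most recently. */
static struct host_task *get_host_task(void)
{
    struct host_task *task = NULL;

    pthread_mutex_lock(&free_lock);
    if (!list_empty(&free_tasks)) {
        task = list_first_entry(&free_tasks, struct host_task, node);
        list_del(&task->node);   /* pop */
    }
    pthread_mutex_unlock(&free_lock);
    return task;                 /* NULL: caller creates a new task */
}

/* Syscall exit: push the task so the next thread entering a
 * syscall reuses it without a reschedule. */
static void put_host_task(struct host_task *task)
{
    pthread_mutex_lock(&free_lock);
    list_add(&task->node, &free_tasks);  /* LIFO push */
    pthread_mutex_unlock(&free_lock);
}
```

Because the stack is LIFO, the task released by thread A is the first one handed to thread B, which is exactly the reuse described in item 1.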



lkl-jenkins commented

Can one of the admins verify this patch?

liuyuan10 (Member) commented

This approach allows multiple host threads to use the same lkl kernel task, which improves performance. But it also allows one host thread to use multiple lkl kernel tasks, and no task state is preserved. What if a socket is created by one task but accessed by another?

laijs force-pushed the lkl/reuse-host-task branch from e963c82 to 5521cad on February 14, 2017 06:08
laijs (Author) commented Feb 14, 2017

Hello, could you tell me more about it?

I think a host thread in lkl-userspace mode should not have any kernel state associated with it; otherwise things would already be broken when "multiple host threads use the same lkl kernel task".

What if a socket is created by one task but accessed by another?

A socket in the kernel can already be accessed by multiple tasks. What would go wrong after this patch?

laijs (Author) commented Feb 14, 2017

When a thread is in lkl-kernel mode, a signal happens, and the signal handler calls an lkl syscall again, this patch handles that situation differently from the old code. It seems neither way is good.

tavip (Member) commented Feb 14, 2017

This will break the POSIX APIs when we add support for multiple processes. @thehajime is working on integration with rump, which will allow that. So it needs to be at least configurable.

However, I don't think this is the path to take. I think our optimizations are going in the wrong direction; we should focus on keeping the regular context switch model and optimizing the context switch itself (maybe via userspace threads).

laijs (Author) commented Feb 14, 2017

What does "the regular context switch model" mean?
Does it mean "only one thread in userspace per CPU"?
If a thread (created by the app, not by the lkl kernel) calls an lkl syscall, it becomes a thread of the lkl kernel and is controlled (scheduled) by the lkl kernel for its whole life in this mode, is that right?

laijs (Author) commented Feb 14, 2017

This will break the POSIX APIs when we add support for multiple processes.

Could you tell me more about it?

laijs force-pushed the lkl/reuse-host-task branch from 5521cad to adaf6f2 on February 14, 2017 12:21
liuyuan10 (Member) commented Feb 14, 2017

A socket in the kernel can already be accessed by multiple tasks.

I missed it. Thanks for correcting me.

tavip (Member) commented Feb 14, 2017

This will break the POSIX APIs when we add support for multiple processes.
Could you tell me more about it?

The idea is that we can connect different userspace processes to LKL instead of the host kernel. In that case we want to keep the process separation, so we need a strict 1:1 mapping between user processes and LKL tasks in order to keep file descriptors and other process resources separated.

See #298 for why this mode of operation is useful. With this mode you can basically "mount" a fs in userspace, then have utilities like cp connect via the syscall proxy to the LKL instance and work out of the box.

tavip (Member) commented Feb 14, 2017

What does "the regular context switch model" mean?
Does it mean "only one thread in userspace per CPU"?
If a thread (created by the app, not by the lkl kernel) calls an lkl syscall, it becomes a thread of the lkl kernel and is controlled (scheduled) by the lkl kernel for its whole life in this mode, is that right?

OK, this might get a bit long, so please bear with me :)

Before the great optimizations Yuan did, we were using the following simple model: each application host thread had an associated Linux task, which ran in a separate, dedicated host thread that only served system calls.

For example, let's say we have only a main application thread. When LKL is initialized, it will create some host threads for the Linux kernel's dedicated kernel threads (stuff like ksoftirqd, kworker, etc.) as well as for the idle thread. Later, when LKL calls /sbin/init, a new host thread is created (we call it a system call thread). This new thread will wait for system calls issued by the main application thread.

The problem we had with this approach was significant system call latency. For each system call, the following happened:

  • an interrupt is queued and the idle thread is woken up via sem_up
  • a host context switch happens: the current application thread blocks and eventually the idle thread runs
  • the interrupt handler runs and the system call thread is woken up
  • another host context switch happens: the idle thread blocks and the system call thread runs
  • the system call is executed; if no blocking is required it finishes and wakes up the application thread
  • another host context switch happens

So, as you can see, we have 3 host context switches per system call. Since there is currently no way to explicitly switch between threads at the host level, this is a significant penalty [1].
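As a rough illustration of the handshake just described (the helper names queue_syscall_irq, idle_sem, app_sem, and syscall_ret are made up for this sketch and are not the real LKL API):

```c
/* Sketch of the old per-syscall flow; every blocking sem_down()
 * below costs a host context switch. */
extern struct lkl_sem *idle_sem, *app_sem;
extern void sem_up(struct lkl_sem *sem);
extern void sem_down(struct lkl_sem *sem);
extern void queue_syscall_irq(long no, long *params);
extern long syscall_ret;

static long old_model_syscall(long no, long *params)
{
    queue_syscall_irq(no, params);  /* queue the syscall interrupt     */
    sem_up(idle_sem);               /* wake up the idle thread         */
    /* host context switch #1: this application thread blocks, the
     * idle thread runs and its interrupt handler wakes the system
     * call thread via sem_up()                                        */
    /* host context switch #2: the idle thread blocks, the system
     * call thread runs the syscall and then wakes us up               */
    sem_down(app_sem);              /* block until the result is ready */
    /* host context switch #3: back to this application thread        */
    return syscall_ret;
}
```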

To avoid this host context switch latency, we started to eliminate the dedicated host threads associated with the Linux tasks, and thus significantly reduced the system call latency.

Now back to my point :) While I think these optimizations are great in terms of performance, I think we are moving too far away from the Linux kernel task model, and sooner or later we will run into trouble. So I am wondering: what if we take the other way around to achieve good performance, i.e. focus on reducing the host context switch cost? This can be done, for example, by using user threads instead of full kernel-backed threads, which should make switches very cheap.
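For instance, here is a small, self-contained illustration of the user-thread idea using POSIX ucontext; it is just an example of how cheap a purely userspace switch is, not LKL code:

```c
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];

static void task_fn(void)
{
    printf("running as a user thread\n");
    swapcontext(&task_ctx, &main_ctx);  /* switch back, no host kernel involved */
}

int main(void)
{
    /* set up a context that runs task_fn on its own stack */
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof(task_stack);
    task_ctx.uc_link = &main_ctx;
    makecontext(&task_ctx, task_fn, 0);

    swapcontext(&main_ctx, &task_ctx);  /* userspace-only context switch */
    printf("back in main\n");
    return 0;
}
```

Both swapcontext() calls stay entirely in userspace, so no host thread ever blocks and no host scheduler is involved.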

Doing this kind of work will have other benefits as well, as we will have a natural way of supporting environments that can't support kernel threads (like bootloaders, UEFI*, etc.).

  • I know that we currently support UEFI thanks to @M1cha's great work, but I think we can do better if we modify the architecture a bit, get rid of the host thread APIs, and instead switch to something a bit more generic. For example, I think that supporting networking in UEFI is hard with the current host APIs we offer in LKL.

[1] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1653/original/LPC%20-%20User%20Threading.pdf

laijs (Author) commented Feb 15, 2017

The idea is that we can connect different userspace processes to LKL instead of the host kernel.

Sorry, I mistakenly read "support multiple processes" as "SMP support", but I'm clear now. Thanks!

tavip (Member) commented Feb 20, 2017

I am going to close this now as it looks like we are in agreement that such an approach is not reasonable. Please feel free to reopen it if I am wrong.

tavip closed this on Feb 20, 2017