Skip to content

Conversation

@Locke
Copy link
Contributor

@Locke Locke commented Jan 15, 2017

Occasionally some of my tests kept stuck in an endless loop. As it was not reproducible I didn't create an issue so far. However this week it got worse, so I decided to take a deeper look into it.

In order to narrow it down I rewrote waitWhileBusy in the Engine to use a blocking wait instead of Thread.yield which fortunately improved the execution time noticeably.

The actual issue turned out to be a race condition in the TestEngineDriver. It happened sometimes that one thread was waiting in resume for the status to become running while the inner thread was kept in a shouldPause loop.

Sketch of a possible execution order:

  1. TestEngineDriverStatus is paused
  2. executeSteps() is called from the outer thread. resume() sets shouldPause to false. As the status is still paused the outer thread yields
  3. the inner thread sees shouldPause as false, sets the status to running, continues to make steps, reaches the stepsLimit, sets shouldPause to true. It then sees shouldPause as true and sets TestEngineDriverStatus back to paused and sleeps
  4. the yielded outer thread starts again, checks the TestEngineDriverStatus sees it as paused and assumes hat it never was running

This only happened on a Core2 Duo with Linux, never on a i7 with 4 physical cores (8 logical) on Win10. As there are only two cores on the Linux machine I suspect the scheduler kept the yielded outer thread longer paused while it let the inner thread and the Engines threads running.

I began writing locks for the TestEngineDriver too, but noticed that the execution so tightly coupled that (in my eyes) it makes no sense at all to have an inner thread in the first place: executeSteps is blocked as long as the inner thread is running, and the inner thread is waiting for work until the next executeSteps is called. Therefore I removed the inner thread completely.

This PR also brings in some smaller changes to the pom.xml Feel free to pick only single commits, they should be completely independent.

// This is here to avoid leaving the engine in
// a state that is right before changing its
// state
boolean shouldRemoveFirstCommand = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: synchronized (engineMode) is no longer needed as isBusyLock is locked just before the call of processNextCommand

break;
}

if (shouldRemoveFirstCommand) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: isBusyChange.signalAll() is not needed here as it is called right after the call of processNextCommand

Locke added 6 commits April 12, 2017 15:31
Useful for external projects depending on test classes like the
TestEngineDriver.
PluginClassLoader.loadCatalog(..., "plugins/") tries to load plugins
from the test-rsc resp. from the test-jar as the folder name conflicts
with the actual plugin folder.
Replace Thread yielding with a blocking wait to improve cpu usage. As
engineBusy and engineMode are not final it is dangerous to synchronize
on them, therefore a ReentrantLock is used. For the blocking wait a
Condition is used that is signaled each time when the result of isBusy
might change.
@Locke
Copy link
Contributor Author

Locke commented Dec 15, 2017

@araschke since you recently pushed some commits and wrote in another issue that you oversaw that notification I assume no one noticed my two pull requests.

As this PR contains multiple commits I'm also happy to split this one into multiple independent PRs if it makes your review easier.

@araschke
Copy link
Member

araschke commented Dec 15, 2017 via email

Locke added 4 commits April 13, 2018 22:44
…ueue

I couldn't measure a perfomance difference, but this design looks better than something custom
Again: no measureable performance difference, but nicer design
This allows the Engine to be idle for a longer period of time without constantly yielding.

- the updates of `engineBusy` and locks of `isBusyLock` should now be easier to follow
-* i.e. idle == being in a blocking wait for the next command
- `isBusyChanged` has been changed to `isNotBusy`
@mstegmaier mstegmaier merged commit df59ae1 into CoreASM:master May 15, 2018
Locke referenced this pull request in Locke/coreasm.core Jan 10, 2019
The main loop assumes that the engine is busy, however the handling of
the emError state sets engineBusy to false. This causes an assertion,
that was added in c750929 via #26, to trigger, which leads in debug
runs (-ea flag set) to a crashed Engine and a blocked TestEngineDriver.

Additionaly the engineBusy flag is set to false even when the Engine
is actually busy as the emError state is left to either emIdle or
emTerminating.

This commit 1) changes the assertion to not throw when in emError state
and 2) corrects the setting of engineBusy to be false only when no
ecTerminate / ecRecover command was handled.

TODO: This basic fix leaves the Engine in a Thread.yield loop until a
ecTerminate / ecRecover command is present. The polling of the
commandQueue should be replaced with a blocking wait.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants