Support overriding most ORT session options via environment variables #16260
ivberg wants to merge 3 commits into microsoft:main
… for use in testing, debugging, perf evaluations
// Necessary otherwise INFO logs below won't actually be logged
auto defaultLoggerOrigSeverity = default_logger.GetSeverity();
logging::LoggingManager::SetDefaultLoggerSeverity(logging::Severity::kINFO);
Is this function call thread-safe?
I believe it is not thread-safe, in the sense that multiple threads could theoretically be setting the severity and this would miss those changes. I don't know how much of an issue this is in practice, because at this point during session creation I am assuming there is only one thread involved. Note that the existing code has this issue, where the INFO event is not actually being logged.
Overall, I'm not seeing the argument for why this needs to be thread-safe, so if you can provide that reasoning, that would be great, @snnn!
Since multiple sessions could be created in the same process, a session's constructor should avoid changing the process's global settings.
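To make the concern concrete, here is a minimal sketch of the save / raise / restore pattern from the diff, next to a per-session alternative. Everything in it is a hypothetical stand-in, not the ORT logging API: `g_default_severity` plays the role of the process-wide default severity, `LogWithGlobalOverride` mimics the pattern in the diff, and `SessionLogger` is an assumed session-owned logger. The "other thread" is simulated with a callback so the lost update is deterministic.

```cpp
#include <functional>
#include <string>
#include <vector>

enum class Severity { kVERBOSE, kINFO, kWARNING, kERROR };

// Hypothetical stand-in for a process-wide default logger severity.
static Severity g_default_severity = Severity::kWARNING;

// Save / raise / restore, as in the diff. `concurrent_change` simulates another
// session (or thread) touching the global while this session is being built.
inline void LogWithGlobalOverride(const std::function<void()>& concurrent_change) {
  const Severity original = g_default_severity;  // save
  g_default_severity = Severity::kINFO;          // raise so INFO lines are emitted
  concurrent_change();                           // someone else changes the global
  g_default_severity = original;                 // restore: overwrites that change
}

// Alternative: a logger owned by the session carries its own threshold,
// so raising it for one session leaves the process default untouched.
class SessionLogger {
 public:
  explicit SessionLogger(Severity min_severity) : min_severity_(min_severity) {}

  // Records the message only if it meets this session's threshold.
  bool Log(Severity severity, const std::string& message) {
    if (severity < min_severity_) return false;
    lines_.push_back(message);
    return true;
  }

  const std::vector<std::string>& lines() const { return lines_; }

 private:
  Severity min_severity_;
  std::vector<std::string> lines_;
};
```

In this sketch, a severity change made during `LogWithGlobalOverride` is silently undone by the restore step, while a `SessionLogger` constructed with `Severity::kINFO` can emit its INFO lines without modifying `g_default_severity` at all.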
The change will make ONNX Runtime hard to debug. When we see a test failure from a CI build pipeline, we often need to reproduce it locally, so we need to know how to reproduce it. Use of environment variables makes that harder, since:
PR #16259 should solve this because it logs all the session options that these variables control. Every override is also logged, so you would not need to dump out the process environment variables. I would not expect these debug variables to be used in a CI pipeline. Environment variables are already used for debugging elsewhere in the codebase; see ORT_DEBUG_NODE_IO*.
pranavsharma left a comment
This is not the direction we want to take. This is a maintenance nightmare for us as we'll need to keep introducing environment variables each time a new session/run option is introduced; it requires writing the code for override, writing tests to ensure the override works and documenting them. This kind of thing fits very well in the application code that calls ORT.
As mentioned in the use case, doing this in the calling API can sometimes be a huge time sink because of recompiling (or impossible) if you want to test the effects of various parameters and get diagnostics or see the effect on perf. Case in point: WinML. How can you get a CPU profile trace with WinML? You can't today, because WinML doesn't expose this session option; this change would allow a dev to do so on a production WinML / ONNX combo. The other debug options are not documented and did not have to pass this level of scrutiny, but I am happy to document them.
The env variables can reside in your code and you can control ORT behavior with them without recompiling any code by implementing the same override logic in the application code. I see no reason for ORT to take this burden on.
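A minimal sketch of what application-side override logic could look like. All names here are assumptions for illustration: `AppSessionConfig` is a hypothetical application struct, and `MYAPP_ORT_INTRA_OP_THREADS` and `MYAPP_ORT_ENABLE_PROFILING` are made-up variable names, not variables that ORT or this PR defines. Each applied override is recorded so it can be logged, and malformed values are ignored rather than guessed at.

```cpp
#include <cstdlib>
#include <exception>
#include <optional>
#include <string>
#include <vector>

// Hypothetical application-side mirror of a few session settings.
struct AppSessionConfig {
  int intra_op_num_threads = 0;
  bool enable_profiling = false;
};

// Returns the variable's value, or nullopt if it is unset or empty.
inline std::optional<std::string> GetEnv(const char* name) {
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') return std::nullopt;
  return std::string(value);
}

// Parses a base-10 integer; rejects trailing junk and out-of-range values.
inline std::optional<int> ParseInt(const std::string& text) {
  try {
    std::size_t pos = 0;
    const int value = std::stoi(text, &pos);
    if (pos != text.size()) return std::nullopt;
    return value;
  } catch (const std::exception&) {
    return std::nullopt;
  }
}

// Applies overrides from the (hypothetical) variables and records each one
// in `applied` so the caller can log it. Returns the number of overrides.
inline int ApplyEnvOverrides(AppSessionConfig& config,
                             std::vector<std::string>& applied) {
  if (auto text = GetEnv("MYAPP_ORT_INTRA_OP_THREADS")) {
    if (auto threads = ParseInt(*text); threads && *threads >= 0) {
      config.intra_op_num_threads = *threads;
      applied.push_back("intra_op_num_threads=" + *text);
    }
  }
  if (auto text = GetEnv("MYAPP_ORT_ENABLE_PROFILING")) {
    config.enable_profiling = (*text == "1");
    applied.push_back("enable_profiling=" + *text);
  }
  return static_cast<int>(applied.size());
}
```

The application would then forward the resulting values to its own session setup, for example through `Ort::SessionOptions::SetIntraOpNumThreads` and `Ort::SessionOptions::EnableProfiling` in the ORT C++ API. This only helps when the application owns the code that builds the session options.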
You may not control the code even if it's within the same company (Microsoft). Case in point: WinML devs can't do this in code calling WinML because it's not exposed. I feel this is quite a useful feature, and I am not sure there has been an adequate attempt to even understand the use case.
I see this as a limitation of WinML, not ORT. ORT exposes a bunch of session options which can be controlled in ways that best suit the application. Happy to get on a call with the WinML team to help resolve this.
Abandoning this PR. These overrides for testing will just live in my fork.
Description
Support overriding most ORT session options via environment variables for use in testing, debugging, perf evaluations
Motivation and Context
ORT session options are key to controlling ORT behavior. This change allows these session options to be overridden at session init time via debug environment variables. This is essential when debugging or investigating performance where a recompile would be burdensome or access to the code calling ORT is difficult. For example, it allows getting a CPU profiling report and testing the optimization level, number of threads, logging levels, and other options on the fly.