-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Fix the guid for the transaction #32573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Servicing candidate? |
|
System.Data.OleDb.Tests are failing with this change. |
|
@jkotas I am looking into the test failure. I haven't been able to replicate the failure on my dev box. The tests are passing for me. I am testing using This could be a test issue, and my testing pattern differs from the rest of the test class, because I am not reusing the connection, but creating a new one. I will submit a change to try out the hypothesis. Potentially because there are parallel connections being opened for a shared resource viz a csv file. |
|
Note that the tests are failing on Win7 and Win81 only. Perhaps that may be a factor? |
|
I could get a repro while executing tests from Visual Studio. However I couldn't repro the same call stack from Command Line. From command line I started seeing AVs. (which isn't good either) While looking into the code I realized that there are duplicate instances of ITransactionLocal object with the same GUID (after I made this change). This causes failure to execute the The existing duplicate interface is at runtime/src/libraries/System.Data.OleDb/src/UnsafeNativeMethods.cs Lines 741 to 758 in 1c498fe
Since the Guid for the two transaction types were different, there were no problems being surfaced during test execution. However after fixing the Guid in the duplicate ITrasnsactionLocal, there seems to be some problem with calling the delegate above which either surfaces as an AV or with The duplicate interface is runtime/src/libraries/System.Data.OleDb/src/SafeHandles.cs Lines 197 to 236 in 1c498fe
The duplicate interface was added after the port to Core to replicate the behavior of what Managed C++ part of System.Data.dll used to handle, which interacting with the ITransactionLocal interface. I will update the PR with the findings and continue to investigate in parallel to see if I can make sense of why the failures are happening. I have very limited knowledge of COM and OleDb APIs to quickly find a solution here. However I will update the PR with the proposed solution and see if there is any feedback. |
|
@AaronRobinsonMSFT any thoughts come to mind? |
|
This is a huge mess. The GUID on I can imagine various assumptions with how these are validated. Let me take a look at some of the failures when/if they come back. |
|
I am going to have to dig into this implementation tomorrow in order to help. It would appear that the OleDB API implementation decided to circumvent the entire RCW/CCW support and manually handle all COM interop - this seems like a bad idea, but fine. What this means is somewhere there is a offset issue or some assumption about the layout of a specific interface that is no longer valid now that the type in question has been updated. |
|
Since there haven't been any updates on this PR for a week, does that mean the changes won't be merged anytime soon? It'd be really nice to have this fixed. |
|
I completely forgot about this issue. Fixing this is clearly the correct thing here, but understanding the failure is important - especially considering the failure is consistent. I will try to look into this today. |
|
@saurabh500 I am unable to build this branch locally. I am getting the following: |
|
@AaronRobinsonMSFT This is a build break than sneaked through the CI. Merge current master into the branch to fix it. |
|
@AaronRobinsonMSFT this is strange. I didn't touch the path where your build is failing. Are you able to build the master? Perhaps a rebase of the branch with master might help ? |
|
@jkotas I just merged in origin/master and now am getting the following error. Please tell me this isn't another known issue. |
|
Scratch that. Nonsense happening locally. |
|
Actually no. This is from dotnet/runtime/master. Boo. see #32927 |
|
A brief update. This test appears to require Access 2016. The failure is in x64, but I can't install that on my machine because it can't be installed if Office (32-bit) is already installed. I hope it doesn't offend anyone that I am not going to uninstall my version of Office simply to attempt a local repro. Looking through the code locally, there are some points of interest to investigate. Based on an offline conversation with @saurabh500, the function throwing the exception is propagating an
Since OleDB is manually handling all their COM calls/marshaling this is entirely in the test owners control and nothing to do with built-in COM interop support. Since other tests seem to be working I am going to assume that the VTable is okay which means we need good ol' mixed-mode debugging. We should step through the Access 2016 implementation of |
|
I spent some time looking at this, and I couldn't pinpoint the exact problem, but my findings may be helpful to you. On .NET Framework, when an OleDbConnection is opened, MS Access creates a record-locking information file (.laccdb) in the same directory as the database. When the OleDbConnection is closed/disposed, the .laccdb file is deleted. However, on .NET Core, if you begin a transaction on a connection, the locking info file is created on open, but does not get deleted on close. (If you don't begin a transaction, the locking info file behaves normally.) On .NET Core, what I observed is that if you rapidly open several connections, the .laccdb file seems to grow in size each time, and it appears that the file has a hard size limit (I found to be 4KB). In my tests, once the file reached that maximum size, any future calls to OleDbConnection.Open() would fail.
Furthermore, I found that if I insert an explicit call to |
|
I'm not sure if anyone can confirm that doing a manual GC.Collect() between test runs causes the tests to pass, but that was my experience, so I believe I have found the problem. It looks like the issue is reference counting on the COM interfaces. The documentation for Marshal.GetObjectForIUnknown(IntPtr) states:
In SafeHandles.cs, there are three calls to Marshal.GetObjectForIUnknown() that create local variables that disappear into the ether and never have Marshal.ReleaseComObject() called on them (until garbage collection). In my tests, adding a Marshal.ReleaseComObject() call to each of these will fix the issue. For example: internal static unsafe OleDbHResult ITransactionAbort(System.IntPtr ptr)
{
OleDbHResult hr = OleDbHResult.E_UNEXPECTED;
ITransactionLocal transactionLocal = null;
RuntimeHelpers.PrepareConstrainedRegions();
try
{ }
finally
{
Guid IID_ITransactionLocal = typeof(ITransactionLocal).GUID;
hr = (OleDbHResult)Marshal.QueryInterface(ptr, ref IID_ITransactionLocal, out var pTransaction);
if (pTransaction != IntPtr.Zero)
{
transactionLocal = (ITransactionLocal)Marshal.GetObjectForIUnknown(pTransaction);
hr = (OleDbHResult)transactionLocal.Abort(IntPtr.Zero, false, false);
Marshal.ReleaseComObject(transactionLocal); // INSERT THIS LINE HERE
Marshal.Release(pTransaction);
}
}
return hr;
}Like I said, I believe this fixes the issue (presumably, the right way) and should allow this PR to be merged. Please let me know your thoughts. It'd be great if we could get this merged soon. Thanks. |
|
@FreddyD-GH Thanks for the investigation and providing the potential fix. |
|
@saurabh500 Just to clarify the changes I'm proposing: simply adding these 3 lines to the rest of your PR allowed things to work in my test environment: |
|
I did just that and now I have new failures in the tests that I am running locally. |
|
Thanks for the update. I just took a look at my tests (including the original tests in the repo) to double check that they all passed after the change (they did), and I realized that that I had forgotten the System.Data.OleDb library uses the text/csv format for its tests, not the accdb format. I apologize for probably sounding like a crazy person before when I was talking about .laccdb files and what not. The only way I could observe bugs from the changed transaction code was to write my own tests, and to do so, I used the driver with an accdb file (since this is my intended scenario). I assume using the driver with text/csv will have a similar issue due to not being disposed properly, but it could be that there are two different bugs. Hopefully, that's not the case, but it is a possibility. |
|
@FreddyD-GH Looks like the test issue was specific to my environment. The tests are passing now for me and the CI is green as well. @maryamariyan @jkotas and @stephentoub there are changes since you folks last reviewed this. Can you take a look and let me know if the changes look good ? |
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Cool. Will it go into 5.0 or there are chances to see it in 2.1/3.1 ? |
|
I need this in 3.1, please! |
For EntityFrameworkCore.Jet's EF Core 3.1 support we will need this as part of a service release targeting .NET Core 3.1 as well. |
|
The PR for porting this fix into servicing is at dotnet/corefx#42878 @jader1313 You can consume the nightly build of System.Data.OleDb to verify if your tests pass with this change. It is shipped as an out of box package and shouldn't need a runtime update. |
|
@saurabh500 thank you very much, it worked on my tests. |
|
@saurabh500 Works for EntityFrameworkCore.Jet as well. |
The guid for
ITransactionLocalwas incorrectly coded. The change fixes the Guid and also adds a test to verify this.The test was throwing an exception before the fix.
The guid was verified from oledb.h header and also was reported in #32405
Fixes #31177
Fixes #32405