Skip to content

[16.0][IMP] fs_storage: invalidate filesystem cache when connection fails#320

Merged
OCA-git-bot merged 1 commit intoOCA:16.0from
ForgeFlow:16.0-imp-fs_storage-store_connection
Feb 6, 2024
Merged

[16.0][IMP] fs_storage: invalidate filesystem cache when connection fails#320
OCA-git-bot merged 1 commit intoOCA:16.0from
ForgeFlow:16.0-imp-fs_storage-store_connection

Conversation

@JordiMForgeFlow
Copy link
Contributor

@JordiMForgeFlow JordiMForgeFlow commented Jan 12, 2024

@JordiMForgeFlow JordiMForgeFlow marked this pull request as draft January 12, 2024 10:00
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 4 times, most recently from 3707e52 to 302a29a Compare January 19, 2024 11:23
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 302a29a to 91f673f Compare January 23, 2024 12:32
@JordiMForgeFlow JordiMForgeFlow changed the title [IMP] fs_storage: add flag to decide whether connection is stored and reused [IMP] fs_storage: add setting to store connection and invalidating filesystem cache Jan 23, 2024
@JordiMForgeFlow JordiMForgeFlow marked this pull request as ready for review January 23, 2024 12:32
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 91f673f to 3c08fcc Compare January 23, 2024 12:38
@JordiMForgeFlow JordiMForgeFlow changed the title [IMP] fs_storage: add setting to store connection and invalidating filesystem cache [16.0][IMP] fs_storage: add setting to store connection and invalidating filesystem cache Jan 23, 2024
Copy link
Contributor

@lmignon lmignon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JordiMForgeFlow Thank you for this improvement. A few comments... In the same time can you add a file named 320.feature into the fs_storage/readme/newsframent with a description of this new feature. The content will be used at merge time to fill the addon's changelog.
Could you also update the USAGE.rst file to describe the new parameters?

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 3c08fcc to 0d6a88a Compare January 23, 2024 14:58
Copy link
Contributor

@LoisRForgeFlow LoisRForgeFlow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

invalidate_cache_on_error maybe is not even needed as a setting and what it adds should be just the standard behavior, but not a big deal.

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 2 times, most recently from 20a1f72 to 17fb3c3 Compare January 25, 2024 07:40
@JordiMForgeFlow
Copy link
Contributor Author

JordiMForgeFlow commented Jan 25, 2024

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.

The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 3 times, most recently from 36a5624 to 4b50530 Compare January 25, 2024 09:25
@lmignon
Copy link
Contributor

lmignon commented Jan 26, 2024

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.

The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

My concern is about the performance cost when the root directory contains hundred of thousand files... To limit the scope of the ls command, it could be performed on a specific path to a marker file. What do you thing about?

@simahawk
Copy link
Contributor

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.
The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

My concern is about the performance cost when the root directory contains hundred of thousand files... To limit the scope of the ls command, it could be performed on a specific path to a marker file. What do you thing about?

I agree. We should probably create a file on 1st check (eg: .odoo_fs_storage_$id.marker).

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 4b50530 to c076645 Compare January 29, 2024 08:48
@JordiMForgeFlow JordiMForgeFlow changed the title [16.0][IMP] fs_storage: add setting to store connection and invalidating filesystem cache [16.0][IMP] fs_storage: invalidate filesystem cache when connection fails Jan 29, 2024
@simahawk simahawk requested a review from sebalix January 29, 2024 10:07
@simahawk
Copy link
Contributor

@sebalix can you try by installing it from https://github.com/fsspec/filesystem_spec ?

@sebalix
Copy link
Contributor

sebalix commented Feb 2, 2024

Regarding connection failure, I replaced the use of the deprecated functions in edi_storage_oca there: OCA/edi-framework#65 (comment)
Copying my comment:

I'm suspecting (but I'm not sure) it was the use of these deprecated methods that triggers connection issues over time. These methods are decorated, and it could be that the decorator is keeping an old reference to self.__fs having a socket that gets closed after some time. That could explain why the "Test connection" button of fs_storage provides the expected result, while the EDI jobs are failing even when requeued.
This will be tested anyway.

Maybe it's not related at all.

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from f2ffb04 to 7f810aa Compare February 5, 2024 09:12
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 7f810aa to 1f4c9f0 Compare February 6, 2024 13:42
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 1f4c9f0 to 46aab5b Compare February 6, 2024 15:02
@lmignon
Copy link
Contributor

lmignon commented Feb 6, 2024

/ocabot merge minor

@OCA-git-bot
Copy link
Contributor

What a great day to merge this nice PR. Let's do it!
Prepared branch 16.0-ocabot-merge-pr-320-by-lmignon-bump-minor, awaiting test results.

@OCA-git-bot OCA-git-bot merged commit f3b8084 into OCA:16.0 Feb 6, 2024
@OCA-git-bot
Copy link
Contributor

Congratulations, your PR was merged at 916060f. Thanks a lot for contributing to OCA. ❤️

# Use a marker file to limit the scope of the LS command for performance.
try:
self._check_connection(self.__fs)
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some tests, we are still facing connections issues and the current implementation doesn't cover it: OSError: Socket is closed
This errors arrives only after some time (several hours, workers have a high lifespan here and FS object is never recreated meanwhile).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebalix maybe then we should not store the FS object at all in the slot, and always call to _get_filesystem

Copy link
Contributor

@lmignon lmignon Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module can also be used in contexts of heavy traffic between odoo and external filesystems. This is particularly the case when it is used to manage Odoo's default attachment storage. In such circumstances, the cost of the reconnection required each time the fs proxy would be recreated is not negligible. Perhaps it would be a good idea to set up a 'keep-alive' mechanism via a cron running every 'X' minutes...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmignon we could do that or putting it as a configuration in the FS storage to decide if you want to check the connection everytime or not. The cron approach however seems to be the solution covering both cases, and one can decide to run it more/less frequently. What do you think @sebalix ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure to understand the cron approach regarding this in-memory object?
But yes, the only fix that is working for us right now is to return a new FS object every time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JordiBForgeFlow thanks for your answer! I checked again.
This PR was not forward ported to 17.0 branch that we were using. Any change to get this done quickly? I will check tomorrow if I can prepare a forward port myself but i am not so familiar with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JordiMForgeFlow okay you are right. I was confused because the def _check_connection method looks different but that is maybe due to different PR later.

So this brings me back to: We still have the issue with closed Socket and we have a test connection method selected for all FS Storage entries.

But I will further investigate it. Thanks for helping!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-herholz-dt unfortunately I am not running this in any V17 project, so I cannot assure you that the changes they did in V17 are still handling correctly the issue.

Maybe you could try changing the _check_connection method back to how it is in V16 to see if it works

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@john-herholz-dt you could be interested by #487 In this PR I add the possibility to disable the cache of the fsspec client in odoo since fsspec already manage it's own cache. IMO, the cache into Odoo should be removed into the next version.

@JordiMForgeFlow @john-herholz-dt AFAIK 16, 17 and 18 branches are aligned. I regularly do back and forward ports between branches and every time I merge a PR. (But I'm not infallible...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[16.0] fs_storage + EDI: connection closed after some time

7 participants