Skip to content

Conversation

@utkarsharma2
Copy link
Contributor

@utkarsharma2 utkarsharma2 commented Sep 10, 2024

related: #42111

With the recent fix we added a check that compares a dag.fileloc and dag_directory However, we never resolve the symbolic links in dag.fileloc but we do resolve them for dag_directory. But in helm charts that uses gitSync we use symbolic links like:

drwxrwsrwx 1 root    root  36 Sep 10 12:08 .
drwxrwxr-x 1 airflow root  60 Sep 10 12:04 ..
drwxr-sr-x 1   65533 root 182 Sep 10 12:08 .git
drwxr-sr-x 1   65533 root  80 Sep 10 12:08 .worktrees
lrwxrwxrwx 1   65533 root  51 Sep 10 12:08 repo -> .worktrees/99c104927b4b568d40d2331095c687817a580678

Which results in a discrepancy:

dag.fileloc:  /opt/airflow/dags/repo/src/test_dags_folder.py, 
dag_directory : /opt/airflow/dags/.worktrees/99c104927b4b568d40d2331095c687817a580678/src 

and leads to failure of the check added in the PR. This PR aims to resolve this issue by ignoring the symbolic links in the dag_directory path.

@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Sep 10, 2024
@utkarsharma2 utkarsharma2 marked this pull request as draft September 10, 2024 19:10
@potiuk potiuk added this to the Airflow 2.10.2 milestone Sep 10, 2024
@potiuk
Copy link
Member

potiuk commented Sep 11, 2024

I think part of the problem is that we cannot resolve the path from symbolic link - because the actual folder that symbolic link points to will be different after every commit.

So we need to find a way how to retain the original symbolic link when we we are parsing the files - but that one might be tricky as it will depend on the way how you actually got to the folder.

Another option is to only compare "relative" parts of the links. The thing is that - at least currently - fileloc / path will always be rooted at DAG_FOLDER - so what we really want to compare is the relative path from that - and a bit of complexity is that it's not entirely true for example dags.

So it's ... tricky.

"""Return the dag_director as a string."""
if isinstance(self._dag_directory, Path):
return str(self._dag_directory.resolve())
# we save dag.fileloc without resolving the symlink in the db, we should be consistent in resolving and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this at the deactivate stale dag code instead of here

@ephraimbuddy
Copy link
Contributor

I think part of the problem is that we cannot resolve the path from symbolic link - because the actual folder that symbolic link points to will be different after every commit.

So we need to find a way how to retain the original symbolic link when we we are parsing the files - but that one might be tricky as it will depend on the way how you actually got to the folder.

Another option is to only compare "relative" parts of the links. The thing is that - at least currently - fileloc / path will always be rooted at DAG_FOLDER - so what we really want to compare is the relative path from that - and a bit of complexity is that it's not entirely true for example dags.

So it's ... tricky.

Should we revert this change and target the fix for another release?

@potiuk
Copy link
Member

potiuk commented Sep 11, 2024

Should we revert this change and target the fix for another release?

If we can't find solution quickly, likely yes.

But I think what we should likely do is to understand that the abspath can contain ".worktrees" and map them back to the "symbolic link" name and store them in both fileloc and directory - but it will likely need to come with some "special case" understanding - getting the DAG_FOLDER value and if we see that the the path does not contain DAG_FOLDER

dag.fileloc:  /opt/airflow/dags/repo/src/test_dags_folder.py, 
dag_directory : /opt/airflow/dags/.worktrees/99c104927b4b568d40d2331095c687817a580678/src 

smth like (pseudo-code):

if not path.startswith(DAG_FOLDER):
   # symbolic link for git-sync
   split_the_paths
   see where they start to differ (/opt/airflow/dags/ is common)
   see that the next two follow `.worktrees'  + "looks_like_commit_hash"
   replace ".worktrees"  + "commit_hash" with whatever next folder is in the DAG_FOLDER after `/opt/airflow/dags`

That shoudl work if we make an assumption (which I think is pretty valid) that we have git-sync. It will not work, however is someone has symbolic links in their repos which are not git-sync created.

If there are other custom solutions where someone would also use symlinks to switch between folders more "manually" - might not work.

Though I have not looked in details yet - at Airflow Summit, why really we get the problem in the first place - why dag_directory is "resolved" - maybe we can avoid it in general ?

@utkarsharma2
Copy link
Contributor Author

This looks more complicated than I initially thought. I agree with @ephraimbuddy that we should revert this PR and proceed with the release. That way we have more time to fix the issue properly.

@utkarsharma2
Copy link
Contributor Author

utkarsharma2 commented Sep 12, 2024

Though I have not looked in details yet - at Airflow Summit, why really we get the problem in the first place - why dag_directory is "resolved" - maybe we can avoid it in general ?

I was also curious why we do it differently for fileloc and dag_directory. I didn't see anything in the initial PR that introduced this method.

Also, the scheduler command doesn't resolve the symbolic links initially.

@belongwqz
Copy link

first plan in 2.10.2, then 2.10.3, now 2.11.0, why ?

@potiuk
Copy link
Member

potiuk commented Oct 25, 2024

first plan in 2.10.2, then 2.10.3, now 2.11.0, why ?

I guess, because it's difficult and no thig priority, but If you would like to step-in and attempt it - maybe you could help with it @belongwqz ?

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Dec 10, 2024
@github-actions github-actions bot closed this Dec 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Scheduler including HA (high availability) scheduler stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants