Skip to content

Conversation

@tirkarthi
Copy link
Contributor

While working on #42700 I found that dagruns in running state without end_date are not returned even with current time sent as end_date. It seems that in the query end_date on being null uses timezone.utcnow() which will have a value greater than the frontend end_date value due to HTTP latency in the order of milliseconds to seconds thus never returning running dagruns and task instances associated with the corresponding dagrun. It's okay to allow dagruns with a start_date and no end_date which usually means they are in running state. Please correct me if I am missing any scenario regarding this.

.where(
DagRun.start_date >= start_date,
func.coalesce(DagRun.end_date, timezone.utcnow()) <= end_date,

cc: @bugraoz93 for #42629

@bugraoz93
Copy link
Contributor

bugraoz93 commented Nov 9, 2024

.where(
DagRun.start_date >= start_date,
func.coalesce(DagRun.end_date, timezone.utcnow()) <= end_date,

This part was lifted and shifted from legacy ui/views.py.
https://github.com/apache/airflow/blob/main/airflow/www/views.py#L3318-L3320

I will have time to check this in detail in 3-4 hours. If you think it is not working as expected, this should be a bug and we should fix it for old versions too.

If you think this part doesn't provide enough latency to the new UI then, it makes sense to make the changes.

@tirkarthi
Copy link
Contributor Author

It seems the default filters in cluster activity add 1 hour to current time as end_date. I am not sure if the same needs to be done in new UI since it presents preset hours like last 1 hour, 8 hours etc. for now where end_date has to be current time in the UI.

const endDate =
searchParams.get(END_DATE_PARAM) ||
// @ts-ignore
moment(now).add(1, "h").toISOString();
const startDate =
searchParams.get(START_DATE_PARAM) ||
// @ts-ignore
moment(endDate).subtract(1, "d").toISOString();

@bugraoz93
Copy link
Contributor

bugraoz93 commented Nov 10, 2024

We should take a step back and clarify the intended use case for the dashboard (old historical_metrics_data). The current lifted and shifted version was designed to show historical data, so it makes sense not to include running jobs in this context. As we transition to the new UI and the dashboard term for this endpoint, we should ensure the endpoint aligns with the updated requirements.

It would be beneficial to get @bbovenzi and @pierrejeambrun’s input to gain the overall vision of the dashboard’s requirements.

@tirkarthi
Copy link
Contributor Author

IMO not showing running dagruns reduces the value of the new dashboard. The running dagruns are present in current cluster activity page too. When I raised the PR I thought it was a bug. It comes to a decision over adding 1hr in UI like cluster activity or changing the API as per the PR to imply dagruns without end_date as running.

@bbovenzi
Copy link
Contributor

Doesn't this always add running dags even when I set my end_date param to be in the past? (ex: I only wanted to look at dag runs from last week)

I think there's a few ways to redo this:

  • make end_date an optional param
  • only filter on start_date, then you capture any dags that ran during that time period, even if they didn't finish within that time period (if the dagrun's end_date is after the date range we also miss it)

@tirkarthi
Copy link
Contributor Author

Doesn't this always add running dags even when I set my end_date param to be in the past? (ex: I only wanted to look at dag runs from last week)

When the dagrun started in the last week after given start_date in the UI and yet to finish with end_date as None for the dagrun it will be present in the API with this change. Before this change it won't be shown.

@bugraoz93
Copy link
Contributor

Doesn't this always add running dags even when I set my end_date param to be in the past? (ex: I only wanted to look at dag runs from last week)

When the dagrun started in the last week after given start_date in the UI and yet to finish with end_date as None for the dagrun it will be present in the API with this change. Before this change it won't be shown.

I dug more after the answers since this was the case in the legacy api. We should check the parameters.py rather than editing the endpoint itself. We should return None here similar to the old version. I think I made it ValueError if end_date is not provided but as @bbovenzi said we can make this value None when it's not provided (optional).

Where we should change:

if not date_to_check:
raise ValueError(f"{date_to_check} cannot be None.")

Where it was before:
if allow_empty is True and not v:
return None

Of course, we should adjust this to make FastAPI accept the value as optional in the endpoint.


-> end_date: DateTimeQuery | None,

I haven't fully tested the changes but from the conversation, this could lead to None values not returning. Because it's not allowing None values at the moment when you pass None. The query should behave the same since there aren't any changes on how we are querying the database 🤔

@bugraoz93
Copy link
Contributor

I can help with the changes and testing if you’d like, @tirkarthi. This issue originated from my change, but if you already have a UI set up for testing, it would be much faster to verify the changes through the UI, as it would show exactly what’s needed. Let me know how you’d like to proceed!

@bbovenzi
Copy link
Contributor

@bugraoz93 You can use this in the UI: #43888

I agree that we should allow end_date be None and then we don't need to worry about constantly changing the end_date in the UI when we set it to now.

@pierrejeambrun
Copy link
Member

pierrejeambrun commented Nov 14, 2024

Doesn't this always add running dags even when I set my end_date param to be in the past? (ex: I only wanted to look at dag runs from last week)

Yes I think we need to pay attention to that case.

Allowing end_date to be none is perfectly fine. The backend can fill it with utcnow() but we need the same utcnow() for comparison ? (if we allow filtering on None, only dags with None as end_date will show, which might not be what we want, this is why I think we should fill the value on the backend side, preventing the front-end from having to manipulate the date, etc...).

Or we can make a case in the Query. We take dags with .end_date == None if end_date param is not provided (i.e None). If it is provided with an actual value, we keep the current implementation. (coalesce None to utcnow and compare)

@tirkarthi
Copy link
Contributor Author

Closing in favour of #44043 . Thanks everyone for the details.

@tirkarthi tirkarthi closed this Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants