@ephraimbuddy ephraimbuddy commented Nov 9, 2021

closes: #13531

A task instance's state can be None, and the API accepts `none` as the filter value for that null state. Until now, filtering by `none` did not return those task instances.

This PR fixes the issue by converting `none` to None and adjusting the query
so that the DB can match this state.
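
The idea described above can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: it uses the standard-library `sqlite3` module and a simplified, hypothetical `task_instance` table, and the helper names `convert_states`, `state_where_clause`, and `find_task_ids` are assumptions made for the example. It shows why a plain `IN` filter never matches rows whose state is NULL, and how an extra `IS NULL` branch covers them.

```python
import sqlite3
from typing import List, Optional, Tuple


def convert_states(states: Optional[List[str]]) -> Optional[List[Optional[str]]]:
    """Map the API's lowercase ``none`` to Python ``None``; keep other states as-is."""
    if not states:
        return None
    return [None if s == "none" else s for s in states]


def state_where_clause(states: List[Optional[str]]) -> Tuple[str, List[str]]:
    """Build a WHERE fragment that also matches NULL, which ``IN`` alone never does."""
    named = [s for s in states if s is not None]
    parts = []
    if named:
        placeholders = ", ".join("?" for _ in named)
        parts.append(f"state IN ({placeholders})")
    if None in states:
        parts.append("state IS NULL")
    if not parts:
        # No states requested: do not filter at all.
        return "1 = 1", []
    return "(" + " OR ".join(parts) + ")", named


def find_task_ids(conn: sqlite3.Connection, states: List[Optional[str]]) -> List[str]:
    """Return the task ids whose state is in ``states`` (``None`` meaning NULL)."""
    clause, params = state_where_clause(states)
    rows = conn.execute(
        f"SELECT task_id FROM task_instance WHERE {clause} ORDER BY task_id", params
    )
    return [row[0] for row in rows]


# Hypothetical, simplified table; the real Airflow schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?)",
    [("a", None), ("b", "running"), ("c", "success")],
)

# A naive IN filter with a NULL parameter matches nothing.
naive_count = conn.execute(
    "SELECT COUNT(*) FROM task_instance WHERE state IN (?)", (None,)
).fetchone()[0]

# With the conversion and the IS NULL branch, the NULL-state row is found.
none_ids = find_task_ids(conn, convert_states(["none"]))
mixed_ids = find_task_ids(conn, convert_states(["none", "running"]))
```

In an ORM such as SQLAlchemy the same shape would typically be an OR of a null check and an `IN` check on the state column; the exact expression used in this PR is in the diff, not reproduced here.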



@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Nov 9, 2021
@kaxil kaxil requested review from dstandish and uranusjr November 9, 2021 22:32
Comment on lines 81 to 83
Member

Why though? The state is called NONE and the Python counterpart is None. I don’t think we use null anywhere in the public API.

Contributor Author

I will remove it, then.

Member

@uranusjr uranusjr left a comment

A few nitpicking comments; the logic looks good to me in general.

ephraimbuddy and others added 3 commits November 11, 2021 14:11
The task instance state can be None and in the API we accept `none` for null state.

This PR fixes this issue by converting the `none` to None and improving the query
so that the DB can get this state.
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
@ephraimbuddy ephraimbuddy force-pushed the fix-task-instance-endpoint branch from 9b05548 to 2d151c3 Compare November 11, 2021 13:11
@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Nov 13, 2021
@github-actions

The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@ephraimbuddy
Contributor Author

cc: @mik-laj

@mik-laj
Member

mik-laj commented Nov 14, 2021

I will review it soon, because I would like to check whether we really create objects in the database that have no value in the state field. I think we can create objects with a state of "None" (the string) and not None, but I need to check that.
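
One way to check the distinction raised above is to count rows with a NULL state separately from rows holding the literal string "None". The sketch below assumes a simplified, hypothetical `task_instance` table in an in-memory SQLite database, and `count_where` is a helper invented for the example; it is not a query against a real Airflow metadata DB.

```python
import sqlite3

# Hypothetical, simplified table with one NULL state and one "None" string state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?)",
    [("null_state", None), ("string_state", "None")],
)


def count_where(conn: sqlite3.Connection, condition: str, params: tuple = ()) -> int:
    """Count rows of the example table matching a trusted, hard-coded condition."""
    query = f"SELECT COUNT(*) FROM task_instance WHERE {condition}"
    return conn.execute(query, params).fetchone()[0]


# Rows with no value in the state column.
null_rows = count_where(conn, "state IS NULL")
# Rows where the state is the four-character string "None".
string_rows = count_where(conn, "state = ?", ("None",))
# Comparing with "=" against NULL matches neither kind of row.
equals_null_rows = count_where(conn, "state = ?", (None,))
```

The two kinds of rows need different predicates, so a filter written for one will silently miss the other.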

@mik-laj
Member

mik-laj commented Nov 14, 2021

I wrote the following test case and it works:

    @provide_session
    def test_should_respond_200_for_none_state_filter(self, session):
        dag_id = 'example_python_operator'
        dag = self.dagbag.get_dag(dag_id)
        run_id = "TEST_DAG_RUN_ID"
        # Create a DAG run to attach the task instance to.
        dr = DagRun(
            run_id=run_id,
            dag_id=dag_id,
            run_type=DagRunType.MANUAL,
        )
        session.add(dr)
        session.commit()
        # Create a single task instance whose state is None (NULL in the DB).
        ti = TaskInstance(task=dag.tasks[0], run_id=run_id, state=None)
        ti.dag_run = dr
        session.add(ti)
        session.commit()
        # Filtering by state=none should return that task instance.
        response = self.client.get(
            f"/api/v1/dags/{dag_id}/dagRuns/~/taskInstances?state=none",
            environ_overrides={"REMOTE_USER": "test"},
        )
        assert 1 == session.query(TaskInstance).count()
        assert 1 == session.query(DagRun).count()
        assert response.status_code == 200
        assert [('example_python_operator', 'TEST_DAG_RUN_ID', 'print_the_context', None)] == [
            (d['dag_id'], d['dag_run_id'], d['task_id'], d['state']) for d in response.json['task_instances']
        ]
        # Filtering by another state should return nothing.
        response = self.client.get(
            f"/api/v1/dags/{dag_id}/dagRuns/~/taskInstances?state=running",
            environ_overrides={"REMOTE_USER": "test"},
        )
        assert 200 == response.status_code
        assert 0 == len(response.json['task_instances'])

However, I ran into another problem. I could not run the tests as long as the execution_date field was defined in TaskInstanceSchema.
Have you encountered this problem?

$ pytest tests/api_connexion/endpoints/test_task_instance_endpoint.py -k test_should_respond_200_for_none_state_filter -s
....
E           sqlalchemy.exc.InvalidRequestError: Mapper 'mapped class TaskInstance->task_instance' has no property 'execution_date'

/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:182: InvalidRequestError

@ephraimbuddy
Contributor Author

Yeah.
You have an old version of marshmallow-sqlalchemy. Probably 0.23. You need to get the latest image or update the package.
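
A quick way to see which version is installed in the current environment is sketched below. It relies on the standard-library `importlib.metadata` module, which assumes Python 3.8 or newer (the traceback above comes from a Python 3.6 image, where `pip show marshmallow-sqlalchemy` would be the equivalent check). The helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of ``package``, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is not installed.
print(installed_version("marshmallow-sqlalchemy"))
```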

@ephraimbuddy ephraimbuddy added this to the Airflow 2.2.3 milestone Nov 20, 2021
@ephraimbuddy
Contributor Author

@mik-laj, can you take a look once more?

@potiuk potiuk merged commit f636060 into apache:main Nov 20, 2021
@ephraimbuddy ephraimbuddy deleted the fix-task-instance-endpoint branch November 20, 2021 16:22
jedcunningham pushed a commit that referenced this pull request Dec 7, 2021
)

* Fix task instance api cannot list task instances with None state

The task instance state can be None and in the API we accept `none` for null state.

This PR fixes this issue by converting the `none` to None and improving the query
so that the DB can get this state.

(cherry picked from commit f636060)
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Dec 8, 2021
Development

Successfully merging this pull request may close these issues.

Airflow v1 REST List task instances api can not get no_status task instance

5 participants