Skip to content

Closes #4: Add datasource health extension.#5

Merged
jezdez merged 7 commits intomasterfrom
emtwo/health_status
Aug 8, 2018
Merged

Closes #4: Add datasource health extension.#5
jezdez merged 7 commits intomasterfrom
emtwo/health_status

Conversation

@emtwo
Copy link
Copy Markdown
Contributor

@emtwo emtwo commented Jun 7, 2018

No description provided.

@emtwo emtwo force-pushed the emtwo/health_status branch 16 times, most recently from 625e3f2 to 761ae19 Compare June 26, 2018 18:22
@emtwo emtwo force-pushed the emtwo/health_status branch from 761ae19 to 0da7110 Compare June 26, 2018 18:46
@emtwo emtwo requested a review from jezdez June 26, 2018 19:17
@emtwo emtwo changed the title [WIP] Closes #4: Add datasource health extension. Closes #4: Add datasource health extension. Jun 26, 2018
Copy link
Copy Markdown
Contributor

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phew, lots of comments, but this goes in the right decision!

I just realized that we're trying to figure out so many details of Redash extensions and how they can be written that I'm totally using this as a test bed. Mea culpa for the extent of feedback 🙏

Comment thread .travis.yml
secure: gTJ7383nAUG0OgXqrQMSc5o9HdCqmqGRqZrz1v1E4FQq+CK3Wz5HGHSBbk6eYLYq86078fBhcxESM05A4h+pq1zW08cKPmraHJm0jtew8bn/qUYE+ML9hsFc7DQsuL+0ft2ZdYJxOyH3PAS7OKg6iG9+cvU3vzpMBV0GFVJLhb7N8XVY4KOe7Muok/C2vpAiTiSy52VssjjyZFB6zdAcx8rMwbNP3hxXs8dXnaXkje4lpTIz4z5BRYaOHgLpFhZ6GEwK8cHXwJAUj9gX1be7iHzXULKQaTgjcdtmXcIKvuU8zN72TgDBoFq6GnmyUpIeJZ9oBF7jdYN1UOQf+D1rUef4ZEmqNtMk+3wD4eASH4xZ6qnqeRWWQ+RPpLjBa2oKu1j4ii3+WmB9PJtB+VzJeQH0Z92clBiVdXagmYtbH7MOXDy4icoELJ+vbA7Ww2rGj/tbBWRP/nE72nHPrk2mc3W7uKRk7V36zINZOF9t5hUHAB8FvpVzVlwpP6VsX/cDVm7hNxdUxNa43WfsljewNN6fe6wF/KLQjU+UjELKzgmGY9ircBnOnMGzv48DtyZ7Caq6IG/NRl/UDvN8ZYwIlswgUAuQ4PrSRSplukglyYnwlg+7KygUlmSeZWpLRADBOa/dkUEZFyYG71iNURCEcwzh3NVWaNTv+JKQzyfAccY=
on:
tags: true
repo: mozilla/redash-stmo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to additionally add a Python version as an added condition here so it's not deploying for all the Python environments when a tag is created.

Comment thread Dockerfile Outdated
@@ -0,0 +1,3 @@
FROM redash/redash:latest
COPY . ./extensions
ENTRYPOINT ["/app/bin/docker-entrypoint"] No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Python entrypoints to work correctly, we'll have to install the redash-stmo package and not only mount the directory under the dist-packages directory.

So I suggest to do the following for the Dockerfile:

  • Add a few env vars that are needed later:
ENV PYTHONUNBUFFERED=0 \
    PYTHONPATH=/app/ \
    REDASH_LOG_LEVEL="info" \
    REDASH_REDIS_URL=redis://redis:6379/0 \
    REDASH_DATABASE_URL=postgresql://postgres@postgres/postgres

PYTHONPATH will later allow the test suite to import the redash package even if the redash-stmo source code is not at the same location. The reason for that is that we should no mix the Redash code with the extension code base via copying to prevent accidental overwriting etc.

  • COPY . /redash-stmo to copy the redash-stmo codebase to the image (which would allow us to bake redash-stmo images that we could push to Dockerhub for easier upstream testing etc later)
  • RUN pip install --editable /redash-stmo to create the required package metadata in the global Python environment and add a link to the copied files under /redash-stmo without copying the whole directory to the site-packages directory.

Comment thread bin/docker-entrypoint Outdated
export REDASH_DATABASE_URL="postgresql://postgres@postgres/tests"

if [ $# -eq 0 ]; then
TEST_ARGS=tests/extension_tests
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That TEST_ARGS value needs to be /redash-stmo

Comment thread bin/docker-entrypoint
;;
manage)
shift
exec /app/manage.py $*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation off

Comment thread docker-compose.yml Outdated
volumes:
- "./src/redash_stmo:/usr/local/lib/python2.7/dist-packages/redash_stmo"
- "./tests:/app/tests/extension_tests"
- "./bin:/app/bin"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we've installed redash-stmo as an editable requirement in the container via the Dockerfile during image build time, we can - on top of that - just switch out the /redash-stmo directory with a volume that points to the current working directory on the host machine during development.

So instead of the three volumes we just need:

- $PWD:/redash-stmo.

That'll mount the whole current working directory of the host machine under /redash-stmo and will make the volume to mount it under the dist-packages directory not needed.

Running pytests inside the container is as simple as running pytest /redash-stmo in the /app directory (since we've added the redash directory /app to the PYTHONPATH so that we can import redash during the tests).

status['data_sources'] = json.loads(redis_connection.get('data_sources:health') or '{}')
return status

def task_info(app=None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this function in a way matching the entrypoint as discussed in the setup.py file above. This must take a Flask app to enable the modification of the status_api view function as discussed below.



@celery.task(name="redash_stmo.datasource_health.health.health_status", time_limit=90, soft_time_limit=60)
def health_status():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename this to update_health_status and rename the update_health_status function above to store_health_status.

import json
import time
import redash
import settings
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No relative implicit imports please, use instead: from . import settings to make this code Python 3 compatible.

# "1": "select 1;",
# "5": "select 1;"
# }
CUSTOM_HEALTH_QUERIES = dict_from_string(os.environ.get("REDASH_CUSTOM_HEALTH_QUERIES", ""))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's feasible to store full dicts in strings on the deploy system. Could we iterate over the os.environ in the code that uses that and look for a prefix instead? Basically have REDASH_CUSTOM_HEALTH_QUERY_1, REDASH_CUSTOM_HEALTH_QUERY_5, REDASH_CUSTOM_HEALTH_QUERY_1234 etc?

task_info = {
"task": health_status,
"interval_in_seconds": settings.HEALTH_QUERIES_REFRESH_SCHEDULE
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a leftover from your previous plan to have task entrypoints that are automatically registered. I'm sorry I didn't go into more detail after the last meeting we talked about, the pattern I thought of is the one Redash uses for enabling periodic tasks.

Basically in the redash/worker.py file it'll check the settings whether it should add that specific task to the schedule.

What I propose is that we use the same pattern to define a similar setting in redash_stmo/settings.py and then update the celery config on demand in the entrypoint callback, basically auto-registering the task when Flask starts up, e.g.:

from redash.worker import celery_schedule
from . import settings

if settings.HEALTH_STATUS_UPDATE_ENABLED:
    celery_schedule['update_health_status'] = {
        'task': 'redash_stmo.health.update_health_status',  # the Celery task
        'schedule': timedelta(minutes=settings.HEALTH_QUERIES_REFRESH_SCHEDULE)
    }
    celery.conf.update(CELERYBEAT_SCHEDULE=celery_schedule)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this was the pattern I initially tried to use, but unfortunately it doesn't seem to be working. I did some reading and my understanding is that add_periodic_task() (http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#entries) can be used to do this, however it needs to have a @app.on_after_configure.connect decorator in order to be processed at any point after initial configuration has been complete. Since add_periodic_task() sets celery.conf.beat_schedule in the background, we would also need to update celery.conf.beat_schedule under the @app.on_after_configure.connect decorator. However, this decorator seems to have no effect when used here in health.py. I suspect this is because the extensions get initialized too late, after the on_after_configure event has already fired and been missed.

Additionally, there seemed to be some stackoverflow comments on this here:
https://stackoverflow.com/questions/41124591/setting-up-periodic-tasks-in-celery-celerybeat-dynamically-using-add-periodic

In order to be able to dynamically manage periodic tasks and reschedule celerybeat at runtime, there’s the django-celery-beat extension that stores the schedule in the Django database, and presents a convenient admin interface to manage periodic tasks at runtime.

Which sounds to me like dynamic celery beat schedule updating only works with the django extension.

This is why I ended up with the design you see here. And on the redash side, I did this: https://github.com/mozilla/redash/pull/444/files#diff-ad93995bb355251c02f87fc02db69351R82

This was the only way I could get it to work. Please let me know if you see something I've missed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, I should have probably have seen this, since in fact ATMO had the same problems of having the schedule tasks dynamically and in the end I used redbeat for that (mozilla/telemetry-analysis-service#318 and mozilla/telemetry-analysis-service#379), which is a similar thing like the old, now defunct django-celery package. It basically provides a custom scheduler that persists the periodic task list in Redis and a bit of API to add/remove it after startup.

That said I don't think the use cases are the same and if we can solve this with the MUCH smaller solution of yours let's do it.

In Flask the best practice for extensions is to have them register themeselves. So I would suggest to change the code in https://github.com/mozilla/redash/pull/444/files#diff-ad93995bb355251c02f87fc02db69351R82 to not populate the list there but do it in the extension itself, e.g.::

def datasource_health(app):
    """A Redash extension to add datasource health status reporting."""

    # Override the default status API view with our extended view
    app.view_functions['%s.status_api' % routes.name] = stmo_status_api
 
    # Add the update_health_status task to a list of periodic tasks
    if not hasattr(app, 'periodic_tasks'):
        app.periodic_tasks = {}

    app.periodic_tasks['update_health_status'] = {
        'signature': update_health_status.s(),
        'schedule': settings.HEALTH_QUERIES_REFRESH_SCHEDULE,
    }

As you can see, instead of inventing our own return format, I've just used a dictionary of keyword parameters that add_periodic_task expects so we keep the design open for any kind of tasks that Celery supports and other Redash extensions may need in the future.

The use of .s() for the signature parameter is also significant since it'll allow Celery to store information about a task in its task store as metadata instead of as Python callables, which are hard to serialize for example to JSON (the recently new default in Celery). In other words, this will be a great help to stay consistent with Celery API design and by that reduce risk for using tasks like this.

@emtwo emtwo force-pushed the emtwo/health_status branch from 185639f to 109f5c0 Compare July 19, 2018 20:31
Copy link
Copy Markdown
Contributor

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done, thank you! Just a few more questions and the one big comment about the return format of the Redash extension callback and self-registration pattern. Almost there! 🙌

Comment thread Dockerfile Outdated
REDASH_REDIS_URL=redis://redis:6379/0 \
REDASH_DATABASE_URL=postgresql://postgres@postgres/postgres

RUN pip install /redash-stmo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be pip install -e /redash-stmo (or --editable) so the volume mounted under /redash-stmo in development will pick up changes done to the code base.

What -e does is create an "egg-link" file which is just a NIH version of a symlink that is picked up automatically when setuptools is installed.

Comment thread src/redash_stmo/health.py

@celery.task(name="redash_stmo.health.update_health_status", time_limit=90, soft_time_limit=60)
def update_health_status():
for data_source in models.DataSource.query:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Redash supports different "organizations" and data sources can be assigned to them individually (the Mozilla setup only has one organization, the default one). There is a notion of a "current organization" per request (e.g. /org1/api/.. and /org2/api/..) but since this task is run outside the request cycle, we can't refer to current_org as Redash does all over the place of the request time code. I don't have a suggestion, but just wanted you to be aware that the status API will mix data sources that could be configured to be separate for different organizations users. It's probably a good idea to ask Arik about it when we move this feature upstream.

Comment thread src/redash_stmo/health.py Outdated
logger.info(u"task=update_health_status state=error ds_id=%s runtime=%.2f", data_source.id, time.time() - start_time)

status = {
"status": "SUCCESS" if runtime is not None else "FAIL",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the double negative and write "FAIL" if runtime is None else "SUCCESS",

Comment thread src/redash_stmo/health.py Outdated
test_connection(data_source.query_runner, query_text)
runtime = time.time() - start_time
except NotImplementedError:
logger.warning(u"Unable to compute health status without test query for %s", data_source.name)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind me, are admins able to define the test query during data source setup? Wondering if this is a warning that can be fixed by admins or if it's enough to logger.info that.

Comment thread test/test_health.py
@@ -0,0 +1,155 @@
import os
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why this is in test and not in tests directory?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in test_health.py we have from tests import BaseTestCase which imports from /app/tests. This has a conflict with /redash-stmo/tests so this import does not work. Renaming to test resolves this.

Is there perhaps a more elegant way to fix this?

Copy link
Copy Markdown
Contributor

@jezdez jezdez Aug 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, we add /app to the Python import path in the Dockerfile to bring Redash and redash-stmo together, forgot about that.

I think it should suffice to remove the __init__.py file from the /redash-stmo/test directory and rename it to tests since the directory doesn't need to be a Python package (directory with a __init__.py) for pytest to correctly collect the tests. For initialization pytest uses an own practice to put things in a conftest.py file. There is a lot of documentation about that here.

@emtwo emtwo force-pushed the emtwo/health_status branch 2 times, most recently from 4ae2092 to 73666a0 Compare August 1, 2018 16:37
@emtwo emtwo force-pushed the emtwo/health_status branch from 73666a0 to 358b769 Compare August 8, 2018 16:28
@jezdez jezdez merged commit 96694cd into master Aug 8, 2018
@jezdez
Copy link
Copy Markdown
Contributor

jezdez commented Aug 8, 2018

Thank you!

@jezdez jezdez deleted the emtwo/health_status branch August 8, 2018 21:14
@jezdez
Copy link
Copy Markdown
Contributor

jezdez commented Aug 8, 2018

Next step would be to release a new version to PyPI (update versions, changelog, push git tag etc) and then a new PR for our Redash fork to use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants