Closes #4: Add datasource health extension #5
Conversation
jezdez
left a comment
Phew, lots of comments, but this is going in the right direction!
I just realized that we're trying to figure out so many details of Redash extensions and how they can be written that I'm totally using this as a test bed. Mea culpa for the extent of feedback 🙏
```yaml
  secure: gTJ7383nAUG0OgXqrQMSc5o9HdCqmqGRqZrz1v1E4FQq+CK3Wz5HGHSBbk6eYLYq86078fBhcxESM05A4h+pq1zW08cKPmraHJm0jtew8bn/qUYE+ML9hsFc7DQsuL+0ft2ZdYJxOyH3PAS7OKg6iG9+cvU3vzpMBV0GFVJLhb7N8XVY4KOe7Muok/C2vpAiTiSy52VssjjyZFB6zdAcx8rMwbNP3hxXs8dXnaXkje4lpTIz4z5BRYaOHgLpFhZ6GEwK8cHXwJAUj9gX1be7iHzXULKQaTgjcdtmXcIKvuU8zN72TgDBoFq6GnmyUpIeJZ9oBF7jdYN1UOQf+D1rUef4ZEmqNtMk+3wD4eASH4xZ6qnqeRWWQ+RPpLjBa2oKu1j4ii3+WmB9PJtB+VzJeQH0Z92clBiVdXagmYtbH7MOXDy4icoELJ+vbA7Ww2rGj/tbBWRP/nE72nHPrk2mc3W7uKRk7V36zINZOF9t5hUHAB8FvpVzVlwpP6VsX/cDVm7hNxdUxNa43WfsljewNN6fe6wF/KLQjU+UjELKzgmGY9ircBnOnMGzv48DtyZ7Caq6IG/NRl/UDvN8ZYwIlswgUAuQ4PrSRSplukglyYnwlg+7KygUlmSeZWpLRADBOa/dkUEZFyYG71iNURCEcwzh3NVWaNTv+JKQzyfAccY=
on:
  tags: true
  repo: mozilla/redash-stmo
```
You'll want to additionally add a Python version condition here, so the deploy doesn't run for every Python environment in the build matrix when a tag is created.
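A hedged sketch of what that could look like in `.travis.yml`; the `condition` value and the `provider` key are assumptions, not part of this PR, and should be adjusted to the actual build matrix:

```yaml
deploy:
  provider: pypi
  on:
    tags: true
    repo: mozilla/redash-stmo
    # only deploy from a single Python environment in the matrix
    condition: $TRAVIS_PYTHON_VERSION = "2.7"
```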
```dockerfile
FROM redash/redash:latest
COPY . ./extensions
ENTRYPOINT ["/app/bin/docker-entrypoint"]
```
For the Python entrypoints to work correctly, we'll have to install the redash-stmo package and not only mount the directory under the dist-packages directory.
So I suggest the following for the Dockerfile:

- Add a few env vars that are needed later:

  ```dockerfile
  ENV PYTHONUNBUFFERED=0 \
      PYTHONPATH=/app/ \
      REDASH_LOG_LEVEL="info" \
      REDASH_REDIS_URL=redis://redis:6379/0 \
      REDASH_DATABASE_URL=postgresql://postgres@postgres/postgres
  ```

  `PYTHONPATH` will later allow the test suite to import the `redash` package even if the redash-stmo source code is not at the same location. The reason for that is that we should not mix the Redash code with the extension code base via copying, to prevent accidental overwriting etc.

- `COPY . /redash-stmo` to copy the redash-stmo codebase to the image (which would allow us to bake redash-stmo images that we could push to Dockerhub for easier upstream testing etc. later).

- `RUN pip install --editable /redash-stmo` to create the required package metadata in the global Python environment and add a link to the copied files under `/redash-stmo`, without copying the whole directory to the site-packages directory.
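Putting those suggestions together, the Dockerfile could look roughly like this; a sketch under the assumptions above, not the final file:

```dockerfile
FROM redash/redash:latest

# Env vars needed by the test suite and the entrypoint
ENV PYTHONUNBUFFERED=0 \
    PYTHONPATH=/app/ \
    REDASH_LOG_LEVEL="info" \
    REDASH_REDIS_URL=redis://redis:6379/0 \
    REDASH_DATABASE_URL=postgresql://postgres@postgres/postgres

# Copy the extension source and install it as an editable package,
# so the global environment links to /redash-stmo instead of copying it
COPY . /redash-stmo
RUN pip install --editable /redash-stmo

ENTRYPOINT ["/app/bin/docker-entrypoint"]
```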
```sh
export REDASH_DATABASE_URL="postgresql://postgres@postgres/tests"

if [ $# -eq 0 ]; then
  TEST_ARGS=tests/extension_tests
```
That `TEST_ARGS` value needs to be `/redash-stmo`.
```sh
  ;;
manage)
  shift
  exec /app/manage.py $*
```
```yaml
volumes:
  - "./src/redash_stmo:/usr/local/lib/python2.7/dist-packages/redash_stmo"
  - "./tests:/app/tests/extension_tests"
  - "./bin:/app/bin"
```
Now that we've installed redash-stmo as an editable requirement in the container via the Dockerfile at image build time, we can, on top of that, just switch out the `/redash-stmo` directory with a volume that points to the current working directory on the host machine during development.

So instead of the three volumes we just need:

- `$PWD:/redash-stmo`

That'll mount the whole current working directory of the host machine under `/redash-stmo` and make the volume that mounts it under the dist-packages directory unnecessary.

Running pytest inside the container is then as simple as running `pytest /redash-stmo` in the `/app` directory (since we've added the Redash directory `/app` to the `PYTHONPATH`, we can import `redash` during the tests).
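A sketch of the corresponding docker-compose snippet; the service name is an assumption:

```yaml
services:
  server:
    build: .
    volumes:
      # replaces the three volumes above; the editable install in the
      # image makes the code under /redash-stmo importable directly
      - "$PWD:/redash-stmo"
```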
```python
    status['data_sources'] = json.loads(redis_connection.get('data_sources:health') or '{}')
    return status


def task_info(app=None):
```
Rename this function in a way matching the entrypoint as discussed in the setup.py file above. This must take a Flask app to enable the modification of the status_api view function as discussed below.
```python
@celery.task(name="redash_stmo.datasource_health.health.health_status", time_limit=90, soft_time_limit=60)
def health_status():
```
Let's rename this to `update_health_status` and rename the `update_health_status` function above to `store_health_status`.
```python
import json
import time
import redash
import settings
```
No implicit relative imports please; use `from . import settings` instead to make this code Python 3 compatible.
```python
#     "1": "select 1;",
#     "5": "select 1;"
# }
CUSTOM_HEALTH_QUERIES = dict_from_string(os.environ.get("REDASH_CUSTOM_HEALTH_QUERIES", ""))
```
I'm not sure it's feasible to store full dicts in strings on the deploy system. Could we iterate over `os.environ` in the code that uses this and look for a prefix instead? Basically have `REDASH_CUSTOM_HEALTH_QUERY_1`, `REDASH_CUSTOM_HEALTH_QUERY_5`, `REDASH_CUSTOM_HEALTH_QUERY_1234` etc.?
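A minimal sketch of what that prefix-based lookup could look like; the function name and prefix constant are assumptions for illustration, not the final implementation:

```python
import os

# Assumed prefix; one env var per data source id, e.g.
# REDASH_CUSTOM_HEALTH_QUERY_5="select 1;"
PREFIX = "REDASH_CUSTOM_HEALTH_QUERY_"


def custom_health_queries(environ=None):
    """Collect custom health queries from prefixed environment variables."""
    if environ is None:
        environ = os.environ
    queries = {}
    for key, value in environ.items():
        if key.startswith(PREFIX):
            # the suffix after the prefix is the data source id
            queries[key[len(PREFIX):]] = value
    return queries
```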
```python
task_info = {
    "task": health_status,
    "interval_in_seconds": settings.HEALTH_QUERIES_REFRESH_SCHEDULE
}
```
I believe this is a leftover from your previous plan to have task entrypoints that are automatically registered. I'm sorry I didn't go into more detail after the last meeting where we talked about this; the pattern I had in mind is the one Redash uses for enabling periodic tasks.
Basically, in the `redash/worker.py` file it checks the settings for whether it should add a specific task to the schedule.
What I propose is that we use the same pattern to define a similar setting in `redash_stmo/settings.py` and then update the Celery config on demand in the entrypoint callback, basically auto-registering the task when Flask starts up, e.g.:
```python
from redash.worker import celery_schedule
from . import settings

if settings.HEALTH_STATUS_UPDATE_ENABLED:
    celery_schedule['update_health_status'] = {
        'task': 'redash_stmo.health.update_health_status',  # the Celery task
        'schedule': timedelta(minutes=settings.HEALTH_QUERIES_REFRESH_SCHEDULE)
    }
    celery.conf.update(CELERYBEAT_SCHEDULE=celery_schedule)
```
Actually, this was the pattern I initially tried to use, but unfortunately it doesn't seem to work. I did some reading, and my understanding is that `add_periodic_task()` (http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#entries) can be used for this, but it needs an `@app.on_after_configure.connect` decorator in order to be processed at any point after initial configuration is complete. Since `add_periodic_task()` sets `celery.conf.beat_schedule` in the background, we would also need to update `celery.conf.beat_schedule` under the `@app.on_after_configure.connect` decorator. However, this decorator seems to have no effect when used here in `health.py`. I suspect this is because the extensions get initialized too late, after the `on_after_configure` event has already fired and been missed.
Additionally, there are some Stack Overflow comments on this here:
https://stackoverflow.com/questions/41124591/setting-up-periodic-tasks-in-celery-celerybeat-dynamically-using-add-periodic

> In order to be able to dynamically manage periodic tasks and reschedule celerybeat at runtime, there's the django-celery-beat extension that stores the schedule in the Django database, and presents a convenient admin interface to manage periodic tasks at runtime.

Which sounds to me like dynamic Celery beat schedule updating only works with the Django extension.
This is why I ended up with the design you see here. And on the redash side, I did this: https://github.com/mozilla/redash/pull/444/files#diff-ad93995bb355251c02f87fc02db69351R82
This was the only way I could get it to work. Please let me know if you see something I've missed.
Aah, I probably should have seen this, since in fact ATMO had the same problem of scheduling tasks dynamically, and in the end I used redbeat for that (mozilla/telemetry-analysis-service#318 and mozilla/telemetry-analysis-service#379), which is similar to the old, now defunct django-celery package. It basically provides a custom scheduler that persists the periodic task list in Redis, plus a bit of API to add/remove tasks after startup.
That said I don't think the use cases are the same and if we can solve this with the MUCH smaller solution of yours let's do it.
In Flask the best practice for extensions is to have them register themselves. So I would suggest changing the code in https://github.com/mozilla/redash/pull/444/files#diff-ad93995bb355251c02f87fc02db69351R82 to not populate the list there but do it in the extension itself, e.g.:
```python
def datasource_health(app):
    """A Redash extension to add datasource health status reporting."""
    # Override the default status API view with our extended view
    app.view_functions['%s.status_api' % routes.name] = stmo_status_api

    # Add the update_health_status task to a list of periodic tasks
    if not hasattr(app, 'periodic_tasks'):
        app.periodic_tasks = {}
    app.periodic_tasks['update_health_status'] = {
        'signature': update_health_status.s(),
        'schedule': settings.HEALTH_QUERIES_REFRESH_SCHEDULE,
    }
```

As you can see, instead of inventing our own return format, I've just used a dictionary of the keyword parameters that `add_periodic_task` expects, so we keep the design open for any kind of tasks that Celery supports and other Redash extensions may need in the future.
The use of `.s()` for the signature parameter is also significant, since it allows Celery to store information about a task in its task store as metadata instead of as Python callables, which are hard to serialize, for example to JSON (now the default in Celery). In other words, this will be a great help in staying consistent with Celery's API design and thereby reduce the risk of using tasks like this.
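For illustration, merging that dict-of-kwargs shape into a beat schedule could be sketched with a small helper; this is hypothetical (names and the numeric-seconds `schedule` assumption are mine), not code from the PR:

```python
from datetime import timedelta


def merge_periodic_tasks(beat_schedule, periodic_tasks):
    """Merge extension-provided periodic task entries (the dict shape
    above, with a numeric 'schedule' in seconds) into a Celery beat
    schedule dict."""
    for name, params in periodic_tasks.items():
        beat_schedule[name] = {
            # in real code the task name would come from params['signature']
            'task': name,
            'schedule': timedelta(seconds=params['schedule']),
        }
    return beat_schedule
```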
jezdez
left a comment
Nicely done, thank you! Just a few more questions and the one big comment about the return format of the Redash extension callback and self-registration pattern. Almost there! 🙌
```dockerfile
    REDASH_REDIS_URL=redis://redis:6379/0 \
    REDASH_DATABASE_URL=postgresql://postgres@postgres/postgres

RUN pip install /redash-stmo
```
This will need to be `pip install -e /redash-stmo` (or `--editable`) so the volume mounted under `/redash-stmo` in development will pick up changes made to the code base.
What `-e` does is create an "egg-link" file, which is just a NIH version of a symlink that is picked up automatically by setuptools.
```python
@celery.task(name="redash_stmo.health.update_health_status", time_limit=90, soft_time_limit=60)
def update_health_status():
    for data_source in models.DataSource.query:
```
So Redash supports different "organizations", and data sources can be assigned to them individually (the Mozilla setup only has one organization, the default one). There is a notion of a "current organization" per request (e.g. `/org1/api/..` and `/org2/api/..`), but since this task runs outside the request cycle, we can't refer to `current_org` as Redash does all over the request-time code. I don't have a suggestion, but just wanted you to be aware that the status API will mix data sources that could be configured to be separate for different organizations' users. It's probably a good idea to ask Arik about it when we move this feature upstream.
```python
        logger.info(u"task=update_health_status state=error ds_id=%s runtime=%.2f", data_source.id, time.time() - start_time)

    status = {
        "status": "SUCCESS" if runtime is not None else "FAIL",
```
Let's remove the double negative and write `"FAIL" if runtime is None else "SUCCESS",` instead.
```python
    test_connection(data_source.query_runner, query_text)
    runtime = time.time() - start_time
except NotImplementedError:
    logger.warning(u"Unable to compute health status without test query for %s", data_source.name)
```
Remind me, are admins able to define the test query during data source setup? I'm wondering if this is a warning that admins can act on, or if `logger.info` would be enough.
```python
import os
```
Any reason why this is in the `test` and not the `tests` directory?
Yes, in `test_health.py` we have `from tests import BaseTestCase`, which imports from `/app/tests`. This conflicts with `/redash-stmo/tests`, so the import doesn't work. Renaming the directory to `test` resolves this.
Is there perhaps a more elegant way to fix this?
Ah right, we add `/app` to the Python import path in the Dockerfile to bring Redash and redash-stmo together; I forgot about that.
I think it should suffice to remove the `__init__.py` file from the `/redash-stmo/test` directory and rename the directory back to `tests`, since it doesn't need to be a Python package (a directory with an `__init__.py`) for pytest to collect the tests correctly. For initialization, pytest has its own convention of putting things in a `conftest.py` file; there is a lot of documentation about that.
> Thank you!
>
> Next step would be to release a new version to PyPI (update versions, changelog, push git tag etc.) and then open a new PR for our Redash fork to use it.