API based allow list script by jon-betts · Pull Request #223 · hypothesis/checkmate

jon-betts · 2021-03-12T15:16:32Z

This adds an API and calls it from the script instead of doing things locally. This is of course a lot more complicated, but it has some benefits:

It completely avoids any syncing issues
The changes take effect immediately on the target system
It provides a mechanism we can re-use in future

Review notes

This adds a new API end-point at POST:/ui/api/rule
This accepts JSON:API style bodies like:

{
    "data": {
        "type": "AllowRule",
        "meta": {"url": "http://please.let.me.annotate.com"}
    }
}

It responds with bodies like:

{
    "data": {
        "type": "AllowRule",
        "id": 23464,
        "attributes": {
            "hash": "97342342347abdef23423423",
            "tags": ["manual"],
            "force": false,
            "rule": "http://please.let.me.annotate.com"
        }
    }
}

We use JSON schema to completely vet the data structure to keep the view simple
The JSON schema functions are separated from the view, but they aren't very fancy
After that there's checking and inserting as before
The script has been converted to call this API
Because calling REST APIs brings its own complexity, the script ends up a similar size to before
The admin page has been altered to display the correct command to run it
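
The request and response shapes in the review notes above can be exercised from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `build_allow_rule_request` and `rule_from_response` are hypothetical, the base URL is the local dev address from the testing notes, and the `Content-Type` header is an assumption. Authentication (the session JWT that the admin page embeds in the command) is left out because the description doesn't say how it is sent.

```python
import json
from urllib.request import Request

# Dev address taken from the testing notes plus the new end-point path.
API_URL = "http://localhost:9099/ui/api/rule"


def build_allow_rule_request(url, api_url=API_URL):
    """Build (but do not send) the POST request asking for `url` to be allowed."""
    payload = {"data": {"type": "AllowRule", "meta": {"url": url}}}
    return Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, see above
        method="POST",
    )


def rule_from_response(body_text):
    """Flatten a JSON:API style response body into a single dict."""
    data = json.loads(body_text)["data"]
    return {"id": data["id"], **data["attributes"]}
```

Against a running dev server the request could then be sent with `urllib.request.urlopen`, once whatever credentials the end-point expects have been added.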

Testing notes

Get a CSV copy of the dev tab: https://docs.google.com/spreadsheets/d/1g82noNwqN8Wzv3CplB_i4YP4iy9F_mhaSRz5xLUgfMY/edit#gid=175010738
Save it as allow_list.csv in the root of Checkmate
make services dev
Visit: http://localhost:9099
Login
Copy the command provided to run the script from the page
You should see it spit out a new CSV file

Possible improvements

We don't respond with the correct content type for JSON:API. It really should be application/vnd.api+json.
We could package up the validation stuff in a way to make it very easy to declare as part of the view definition

jon-betts · 2021-03-12T15:18:19Z

+                    "required": ["url"],
+
+                    "properties": {
+                        "url": {"type": "string", "format": "public-url"}


"format": "public-url" is the most interesting part of this schema.

This is a custom format defined by us. It's totally valid to put whatever you like here in JSON schema. There are a few default formats understood by all standard JSON schema validators, but any they don't understand are just ignored.

I decided to make the url part of the meta for this object rather than the attributes as:

The final rule object doesn't actually include the URL value as provided

This means upon getting a response, you'd get one without this attribute

It is however required to create it
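
The diff excerpt doesn't show how the `public-url` format is checked, so the following is only a guess at the kind of predicate that could sit behind it: a standard-library sketch that accepts http(s) URLs whose host is neither `localhost` nor a non-public IP address. The function name `is_public_url` and every rule inside it are assumptions.

```python
import ipaddress
from urllib.parse import urlsplit


def is_public_url(value):
    """Return True if `value` looks like an http(s) URL on a public host."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
        host = parts.hostname
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a broken IPv6 literal.
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    if host == "localhost":
        return False
    try:
        # Literal IP addresses must be globally routable.
        return ipaddress.ip_address(host).is_global
    except ValueError:
        # Not an IP literal: treat it as a domain name, which needs a dot.
        return "." in host
```

With the `jsonschema` library, a predicate like this can be registered under a format name using `FormatChecker.checks`, and the checker then has to be passed to the validator for the format to be enforced. Whether this PR wires it up that way isn't visible in the excerpt.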

jon-betts · 2021-03-12T15:19:26Z

+            "required": ["type", "meta"],
+
+            "properties": {
+                "type": {"enum": ["AllowRule"]},


The type is mandatory for JSON:API. I chose the class name as the name for the type as:

It's not directly tied to the DB

It might allow us to be fancy one day and automatically pick the right DB class or something

jon-betts · 2021-03-12T15:20:19Z

+<style>
+    code {
+        word-break: break-all;
+    }


The command gets very long with the session JWT, so enable word wrapping so we can copy it without scrolling horizontally.

jon-betts · 2021-03-12T15:21:40Z

+
+    :raise MalformedJSONBody: If the JSON cannot be decoded or the body
+        does not conform to the schema provided
+    """


I'm sure we could do a fancy decorator like we do in h for applying the schema to a view. This works for now and achieves the main aim of making this not view specific. If we find we are using this a lot I think that would be a good time to think about making it slicker (rule of 3?)
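
For readers without the full diff, a helper of this kind could look roughly like the sketch below. It is not the PR's code: it takes the raw body text rather than a Pyramid request, the "validator" is a plain function yielding error strings rather than a JSON schema validator, and `MalformedJSONBody` is a local stand-in for the PR's exception.

```python
import json


class MalformedJSONBody(Exception):
    """Local stand-in for the PR's exception of the same name."""


def validated_json_body(raw_body, find_errors):
    """Decode `raw_body` as JSON and check it, raising MalformedJSONBody on failure."""
    try:
        body = json.loads(raw_body)
    except (TypeError, ValueError) as err:  # JSONDecodeError is a ValueError
        raise MalformedJSONBody(f"Cannot decode JSON: {err}") from err

    errors = list(find_errors(body))
    if errors:
        raise MalformedJSONBody("; ".join(errors))
    return body


def allow_rule_errors(body):
    """Hypothetical hand-written check mirroring the schema's required fields."""
    data = body.get("data") if isinstance(body, dict) else None
    if not isinstance(data, dict):
        yield "'data' must be an object"
        return
    if data.get("type") != "AllowRule":
        yield "'data.type' must be 'AllowRule'"
    meta = data.get("meta")
    if not isinstance(meta, dict) or not isinstance(meta.get("url"), str):
        yield "'data.meta.url' must be a string"
```

Because the check is passed in as an argument, the same helper works for any view, which is the "not view specific" property the comment is after.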


jon-betts · 2021-03-12T15:24:08Z

+    """Render an HTML version of a blocked URL with explanation."""
+
+    body = get_validated_json_body(request, _ALLOW_RULE_VALIDATOR)
+    url = body["data"]["meta"]["url"]


We know this is totally valid as it's just come through the schema. If this was the wrong shape, we couldn't get this far as an exception would already have been raised.

jon-betts · 2021-03-12T15:24:31Z

+
+    rule = AllowRule(rule=rule_string, hash=hex_hash, tags=["manual"])
+    request.db.add(rule)
+    request.db.flush()  # Make sure an id is allocated before we serialise


The id field is mandatory in JSON:API


jon-betts · 2021-03-12T15:25:27Z

+
+        return {
+            "type": self.__class__.__name__,
+            "id": self.id,


This isn't flexible here, and I think it shouldn't be unless it has to be. Having every object have its primary key as id is a very handy convention.
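
The convention in the excerpt (type from the class name, `id` from the primary key) can be sketched as a small mixin. The names `JSONAPIMixin`, `ATTRIBUTES` and `to_json_api` are hypothetical, and a dataclass stands in for the real database model; only the `type`/`id` lines and the attribute names come from this page.

```python
from dataclasses import dataclass, field


class JSONAPIMixin:
    """Serialise an object with an `id` into a JSON:API style resource."""

    ATTRIBUTES = ()  # names of the fields to expose under "attributes"

    def to_json_api(self):
        return {
            "type": self.__class__.__name__,
            "id": self.id,
            "attributes": {name: getattr(self, name) for name in self.ATTRIBUTES},
        }


@dataclass
class AllowRule(JSONAPIMixin):
    """Stand-in for the DB model; the real class is not shown in the excerpt."""

    id: int
    rule: str
    hash: str
    tags: list = field(default_factory=list)
    force: bool = False

    ATTRIBUTES = ("hash", "tags", "force", "rule")
```

Relying on `self.id` keeps the mixin free of per-class configuration, at the cost of requiring every serialisable class to follow the convention.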

seanh

As discussed on Slack I'd like us to give Marshmallow a try for the validation as I think it's our chosen validation library. If it turns out not to work very nicely for this then we can reject it.

Other than that just some minor suggestions.

I ran it locally and it seemed to work for me: it loaded the rules into the DB, and after running the command the sites from the CSV were allowed.

seanh · 2021-03-12T18:09:39Z

+class MalformedJSONBody(JSONAPIException):
+    """The JSON body is malformed in some way."""
+
+


Just so it 400's instead of 500's for a malformed request:

Suggested change

-class MalformedJSONBody(JSONAPIException):
-    """The JSON body is malformed in some way."""
+class MalformedJSONBody(JSONAPIException):
+    """The JSON body is malformed in some way."""
+
+    status_code = 400

seanh · 2021-03-12T19:18:04Z

    config.add_route("login_callback", "/ui/api/login_callback")
    config.add_route("logout", "/ui/api/logout")

+    config.add_route("add_to_allow_list", "/ui/api/rule", request_method="POST")


I think this should probably be /api/rule/, as I don't think we know that we'll ever add an admin page that calls this API. We may never add admin pages for managing the allow list at all if the priority isn't high enough (which seems likely, if we don't get many requests). Even if we do one day end up adding admin pages, we don't know for sure that they'll end up calling this API from JavaScript. So we might end up with an API that never is called by admin pages, but is just confusingly under /ui/api/. It might be better to just put it under /api/ for now and move it in the future if we decide to.

This'd also mean moving add_to_allow_list.py

seanh · 2021-03-12T19:19:30Z

+    effective_principals=[Principals.STAFF],
+)
+def add_to_allow_list(_context, request):
+    """Render an HTML version of a blocked URL with explanation."""


Suggested change

-    """Render an HTML version of a blocked URL with explanation."""
+    """Add a new rule to the allow list."""

seanh · 2021-03-12T19:22:15Z

+
+    :raise MalformedJSONBody: If the JSON cannot be decoded or the body
+        does not conform to the schema provided
+    """


I don't think we need to do it but a lighter weight nicety could be to add this as a request method: request.validated_json_body(validator). Similar to the built-in request.json_body property

seanh · 2021-03-12T19:23:13Z

+    # Check this isn't something really dumb like 'co.uk' which will ruin the
+    # allow list


This check still needs to be added?
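
One possible shape for the missing check, purely as an illustration: reject rules that are a bare top-level domain or a known public suffix. The function name, the tiny hard-coded suffix set and the exact rule are all assumptions; a real check would need a complete public suffix list.

```python
# Hypothetical, deliberately tiny sample; a real check needs the full list.
KNOWN_PUBLIC_SUFFIXES = frozenset({"com", "org", "net", "uk", "co.uk", "ac.uk"})


def is_too_broad(rule_domain):
    """Return True if allowing `rule_domain` would allow a whole public suffix."""
    domain = rule_domain.strip().strip(".").lower()
    # A domain with no dot is a bare TLD (or a local name); both are too broad.
    return "." not in domain or domain in KNOWN_PUBLIC_SUFFIXES
```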

seanh · 2021-03-12T19:26:06Z

+    try:
+        # We expect a detection from not being on the allow list, so we'll
+        # remove it, which will trigger a ValueError if it wasn't there
+        reasons.remove(_ALLOW_LIST_DETECTION)
+    except ValueError:
+        raise ResourceConflict("Requested URL is already allowed") from None


Just tweaked the comments to be (I think) a little clearer:

Suggested change

-    try:
-        # We expect a detection from not being on the allow list, so we'll
-        # remove it, which will trigger a ValueError if it wasn't there
-        reasons.remove(_ALLOW_LIST_DETECTION)
-    except ValueError:
-        raise ResourceConflict("Requested URL is already allowed") from None
+    try:
+        # We expect `reasons` to contain a NOT_ALLOWED detection because
+        # `url` shouldn't be on the allow list yet. Remove it.
+        reasons.remove(_ALLOW_LIST_DETECTION)
+    except ValueError:
+        # The expected NOT_ALLOWED detection wasn't found:
+        # `url` must already be on the allow list.
+        raise ResourceConflict("Requested URL is already allowed") from None
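
The pattern relies on `list.remove` raising `ValueError` when the item is absent. A standalone sketch of that control flow follows; `NOT_ALLOWED`, `reasons_still_blocking` and the local `ResourceConflict` class are stand-ins, since the real constant and exception live elsewhere in the codebase.

```python
class ResourceConflict(Exception):
    """Local stand-in for the PR's exception of the same name."""


NOT_ALLOWED = "not-allowed"  # hypothetical value for _ALLOW_LIST_DETECTION


def reasons_still_blocking(reasons):
    """Drop the expected NOT_ALLOWED detection and return whatever is left."""
    remaining = list(reasons)
    try:
        # list.remove raises ValueError if the item isn't in the list.
        remaining.remove(NOT_ALLOWED)
    except ValueError:
        # No NOT_ALLOWED detection means the URL is already on the allow list.
        raise ResourceConflict("Requested URL is already allowed") from None
    return remaining
```

`from None` suppresses the chained `ValueError` so the caller sees only the conflict.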

seanh · 2021-03-12T19:29:40Z

+    # Don't fail fast, so we get all of the detections
+    checker = request.find_service(URLCheckerService)
+    reasons = list(checker.check_url(url, fail_fast=False))


Making it a bit more explicit what this comment refers to:

Suggested change

-    # Don't fail fast, so we get all of the detections
-    checker = request.find_service(URLCheckerService)
-    reasons = list(checker.check_url(url, fail_fast=False))
+    checker = request.find_service(URLCheckerService)
+    # Pass fail_fast=False to get *all* the reasons, not just the first one.
+    reasons = list(checker.check_url(url, fail_fast=False))
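
To illustrate what the flag changes, here is a toy generator with the same `fail_fast` parameter. It is not `URLCheckerService`: the checker functions, the reason strings and the default value of `fail_fast` are all made up for the example.

```python
def check_url(url, checkers, fail_fast=True):
    """Yield a reason from each checker that objects to `url`.

    With fail_fast=True the generator stops after the first reason.
    """
    for checker in checkers:
        reason = checker(url)
        if reason is None:
            continue
        yield reason
        if fail_fast:
            return


def not_on_allow_list(url):
    # Hypothetical checker: pretend nothing is on the allow list yet.
    return "not-allowed"


def looks_malicious(url):
    # Hypothetical checker: flag anything with "evil" in the URL.
    return "malicious" if "evil" in url else None


CHECKERS = [not_on_allow_list, looks_malicious]
```

Wrapping the call in `list(...)` matters because the result is consumed twice in the view: once to remove the allow list detection and once to see what remains.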

seanh · 2021-03-12T19:35:38Z

+
+    rule = AllowRule(rule=rule_string, hash=hex_hash, tags=["manual"])
+    request.db.add(rule)
+    request.db.flush()  # Make sure an id is allocated before we serialise


I think a nice way to do this might be implement a JSON:API renderer so the view just does return rule and the renderer turns that returned object into JSON. Depending on when Pyramid calls renderers the renderer might not even have to call db.flush() if the transaction has already been commited by then. I suspect they probably get called before transaction commit though. As an example we have a custom SVG renderer in h.

I think that'd be trying way too hard for this PR though
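
For reference, the renderer idea could be sketched like this. Pyramid renderer factories are callables that take an `info` object and return a `(value, system)` callable, so the sketch needs no imports from Pyramid itself. The `to_json_api` method on the returned object is a hypothetical name, and the sketch does nothing about the `db.flush()` timing question raised above.

```python
import json


def json_api_renderer_factory(info):
    """Renderer factory: called once by Pyramid, returns the per-response renderer."""

    def _render(value, system):
        request = system.get("request")
        if request is not None:
            # The content type the PR description says JSON:API really wants.
            request.response.content_type = "application/vnd.api+json"
        # `to_json_api` is a hypothetical method on the object the view returns.
        return json.dumps({"data": value.to_json_api()})

    return _render
```

It would be registered with `config.add_renderer("json_api", json_api_renderer_factory)` and selected with `renderer="json_api"` in the view configuration, where the name `json_api` is arbitrary.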

jon-betts · 2021-03-17T13:56:32Z

This has been superseded in favour of: #224

Add jsonschema dependency

4fe2d04

jon-betts added the wip label Mar 12, 2021

jon-betts self-assigned this Mar 12, 2021

jon-betts commented Mar 12, 2021


jon-betts requested a review from seanh March 12, 2021 15:33

jon-betts force-pushed the api-based-add-to-allow branch from c3199d0 to 5852bd4 Compare March 12, 2021 15:39

jon-betts mentioned this pull request Mar 12, 2021

Developer-friendly method for managing the allow-list #195

Closed

4 tasks

seanh suggested changes Mar 12, 2021


Jon Betts added 3 commits March 15, 2021 15:12

Add a new end-point for checking and adding rules to the DB

cc84fc4

Change the script to use the end-point instead of doing it locally

35f3a5a

Add a command to the admin page to show how to call the script

75fe961

jon-betts mentioned this pull request Mar 15, 2021

API based add to allow list script (marshmallow) #224

Merged

jon-betts force-pushed the api-based-add-to-allow branch from 9fba3ad to 75fe961 Compare March 15, 2021 15:44

jon-betts closed this Mar 17, 2021
