Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ Environment variables:
| Name | Effect | Example |
|------|--------|---------|
| `CHECKMATE_BLOCKLIST_URL` | Where to download the blocklist online | `https://some-aws-s3.bucket/file.txt` |
| `CHECKMATE_SECRET` | Secret used for signing URLs | `AB823F97FF2E330C1A20`

For details of changing the blocklist see:

Expand Down
6 changes: 6 additions & 0 deletions checkmate/app.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,12 +69,18 @@ def configure(config, celery_worker=False): # pragma: no cover
if celery_worker:
# Configure sentry
config.add_settings({"h_pyramid_sentry.celery_support": True})

else:
# The celery workers don't need to know about this stuff
config.include("pyramid_jinja2")
config.include("pyramid_services")

# Include the secret for crytographic functions
config.registry.settings["secret"] = os.environ["CHECKMATE_SECRET"]
Comment on lines +78 to +79
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like the simple approach to config settings. If the CHECKMATE_SECRET envvar is missing the app will simply crash on startup and it'll be pretty obvious why, and this keeps our code simple. h and lms support reading any setting from either the config file or the environment, but I don't think that's really necessary


config.scan("checkmate.views")
config.include("checkmate.routes")
config.include("checkmate.services")

# Configure sentry
config.add_settings(
Expand Down
5 changes: 5 additions & 0 deletions checkmate/routes.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,8 @@ def includeme(config): # pragma: no cover
"""Pyramid config."""
config.add_route("get_status", "/_status")
config.add_route("check_url", "/api/check")

# Serve content from the static/static directory at /ui/static
config.add_static_view("ui/static", "static/static", cache_max_age=3600)
Copy link
Copy Markdown
Contributor

@seanh seanh Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for the cache_max_age=3600?

I think 3600 is the default value.

Both h and lms do cache_max_age=None:

At first I thought None must disable caching, but actually I think it might allow caching forever, because both h and lms add what look like cache-busting query params onto static asset URLs.

We might want to do that for Checkmate too so that we can let assets be cached forever and (more importantly) so that we don't have trouble with old cached versions of assets taking time to expire. I'm not exactly sure how h and lms do it but Pyramid has a built-in cache-busting thing that changes a cache-busting query param each time the app server restarts: https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/assets.html#cache-busting That's not going to be perfect for us because each app server will cache each asset separately, but it might be good enough? You can also customize the cache buster so we could put in a better mechanism.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we have some cache busting we can't serve the content with an infinite expiry date. And it's not trivial to add. We have delayed most of the fancy caching for checkmate as not being essential.

If 3600 is the default, it doesn't hurt to make it clear it's happening, and we do want some caching on this content, otherwise it's going to be served live every time.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose, since these are just error pages with a mailto link rather than an interface that you actually use (like an admin page, or fancier error pages), and since the only assets are a CSV file and an SVG, a one-hour delay in someone seeing an updated version of the page or the page loading with different versions of different assets (e.g. new CSV but old SVG) is unlikely to cause many problems.

In more complex cases I think old cached versions can cause breakage. So we'll probably have to deal with cache-busting when we add static assets for the admin pages (unless we use a different static view for those and don't cache them) or if the error pages become much fancier.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we will want it, I just remember it not being trivial to add


config.add_route("present_block", "/ui/block")
8 changes: 8 additions & 0 deletions checkmate/services/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
from checkmate.services.secure_link import SecureLinkService


def includeme(config): # pragma: no cover
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, yeah there seems little point in testing a method like this. My preference would probably be to add a test that simply calls the method but doesn't assert anything, so it at least tests that the method doesn't crash. But I think no cover on something like this is fair enough

config.register_service_factory(
"checkmate.services.secure_link.factory",
iface=SecureLinkService,
)
79 changes: 79 additions & 0 deletions checkmate/services/secure_link.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
"""Create and check secure links."""

from hashlib import sha256


class SecureLinkService:
"""A class for generating secure links to our own routes."""

TOKEN_ARG = "sec"
VERSION_ARG = "v"
Comment on lines +9 to +10
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these actually TOKEN_KEY and VERSION_KEY? They're actually the keys not the values, and "arg" made met think of values rather than keys. They're awkward to name though, *_KEY doesn't seem wonderfully clear either. I think the code might actually be clearer and just as easy to maintain if it just duplicated the strings "sec" and "v" when needed instead of using constants, the indirection and constant names might just add confusion


def __init__(self, secret, route_url):
"""Create a new BlockURLService object.

:param secret: The secret to sign and check args with
:param route_url: The pyramid `request.route_url` function
"""
self._secret = secret
self._route_url = route_url
Comment on lines +12 to +19
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 It's really nice to inject things like secret directly and decouple SecureLinkService from reading a config setting or envvar itself


def route_url(self, route_name, *elements, **kw):
"""Get a secure version of route to our own app.

This works identically to the Pyramid `Request.route_url()` function
with the exception that the parameters are modified in place to sign
them. This signature can be checked with `is_secure()`.
"""
Comment on lines +21 to +27
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the idea of a generic secure links service and following request.route_url()'s interface


args = kw.get("_query")
if args is None:
raise ValueError("You must provide some parameters to sign")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise ValueError("You must provide some parameters to sign")
raise ValueError("You must provide some query parameters to sign")

(Just slightly clearer maybe)


# Include this to allow us to cope with breaking changes in the
# future if we ever decide to make some.
args[self.VERSION_ARG] = "1"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so it adds &v=1&sec=*** to the URL's query string. Sounds good 👍

We could get defensive and have it raise if either "v" or "sec" are already in args (could just be a couple of assert's). I think as written the method will behave strangely if args already contains "v" and "sec": it'll ovewrite them, but it'll also have signed the version of the args with the original "sec" so it'll return a secure link that actually won't pass the is_secure() test


# We really should include the elements in the hash, but at present
# we only have one link, and it has no elements.
args[self.TOKEN_ARG] = self._hash_args(route_name, args)
Comment on lines +33 to +39
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's signing the args and the route_name. Is this so that you can't just pull the args off one URL and paste them onto a different URL? And I guess the idea is that we can actually change the URLs and paths without breaking anything, as long as we keep the (internal) route name the same?


return self._route_url(route_name, *elements, **kw)

def is_secure(self, request):
"""Check if a request was signed correctly.

:param request: A pyramid Request object to check
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:param request: A pyramid Request object to check
:param request: A Pyramid Request object to check

:rtype: bool
"""
args = request.GET

if self.TOKEN_ARG not in args:
return False

version = args.get(self.VERSION_ARG)
if version != "1":
return False
Comment on lines +51 to +56
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 So an unsigned request is simply considered not secure. Nice


without_token = dict(args)
token = without_token.pop(self.TOKEN_ARG)
Comment on lines +58 to +59
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about constructing without_token as a dict with the token and then popping the token out, and about getting the token from without_token instead of from args (where it originally came from) struck me as slightly confusing. An alternative might be to have one line to construct without_token as a dict without the token then one line to extract token out of args:

Suggested change
without_token = dict(args)
token = without_token.pop(self.TOKEN_ARG)
without_token = {key: val for key, val in args.items() if key != self.TOKEN_ARG}
token = args[self.TOKEN_ARG]


return token == self._hash_args(request.matched_route.name, without_token)

def _hash_args(self, route_name, args):
digest = sha256()

digest.update(self._secret.encode("utf-8"))
digest.update(route_name.encode("utf-8"))

for key, value in sorted(args.items()):
digest.update(key.encode("utf-8"))
digest.update(value.encode("utf-8"))

return digest.hexdigest()


def factory(_context, request):
return SecureLinkService(
secret=request.registry.settings["secret"], route_url=request.route_url
)
146 changes: 146 additions & 0 deletions checkmate/static/static/css/wrapper-style.css
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
/**
* Eric Meyer's Reset CSS v2.0 (http://meyerweb.com/eric/tools/css/reset/)
* http://cssreset.com
*/
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td,
article, aside, canvas, details, embed,
figure, figcaption, footer, header, hgroup,
menu, nav, output, ruby, section, summary,
time, mark, audio, video {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
/* HTML5 display-role reset for older browsers */
article, aside, details, figcaption, figure,
footer, header, hgroup, menu, nav, section {
display: block;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}
blockquote:before, blockquote:after,
q:before, q:after {
content: '';
content: none;
}
table {
border-collapse: collapse;
border-spacing: 0;
}

/* page style */
body {
background-color: #ececec;
margin: 0 auto;
color: #333333;
max-width: 960px;
font-family: 'Source Sans Pro', sans-serif;
-webkit-font-smoothing: antialiased;
text-align: center;
}

header {
padding: 100px 0 0 0;
}

header, article {
text-align: left;
margin: 0 auto;
max-width: 600px;
}

p, ol {
font-size: 20px;
line-height: 27px;
padding-bottom: 30px;
max-width: 600px;
margin: 0 auto;
}

strong {
font-weight: 700;
}

h1 {
font-weight: 700;
font-size: 20px;
line-height: 27px;
}

p a, li a {
border-bottom: 1px solid #DCDCDC;
text-decoration: none;
color: #333333;
}

ol {
list-style-type: decimal;
padding-left: 17px;
}

ol li {
font-size: 20px;
line-height: 27px;
padding-bottom: 30px;
}

a:hover {
color: #333333;
}

footer {
padding: 90px 0;
}

#footer-logo {
display:inline-block;
width: 24px;
height:28px;
background: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADAAAAA4CAQAAAARWLNhAAABgUlEQVR42u2Yz0rDQBDGhwpKe6wUSXspepE+gGfv+gCtPaTv4EMIHrx7K0iepILXmQ2btAQECYmXnoSWStUYxNLdNmoKsyCab2677P6WbybZPwAwapCDESbMEZEzasDH9GP2yT+DxikiXX1iLsgBA+ZoRoHR6dMoAJsDQBHOC0ABKAB/APCCUzwX+3I7Kfm78kic4Q3NWAFkw4rcA7xlA9AAMhSWacAEcI8hU34zv1HfAr4WXhsGuCdMADwlF6c0o3txEewogD0WgDikZ+UIcrnsSbbojaNMr7QzzoNWYU8cZXqnAV6TEjfgUe8Ly8wAnJgGzP8PgELqeJZnUYdCAwAKZXXRLqs6ggfQ1mqvzQ7wLHWUZ7EDhnV11LDODhBd7Q/W5a+iKKgt2oOafin7adPP+x3EZPvNdKezMS6uUL8PYPoibvwpwfhjiPHnnDyifub6+sCluIJyDSDjCvBJtlYOABPZAl6JngoQPeDXMhOM7mdmgtf9tUzwu69nYlP33wEjM9swn4BymAAAAABJRU5ErkJggg==);
background-size: 24px 28px;

font-size:0;
}

@media (max-width: 800px) {
body {padding: 2em 1em;}
header {padding: 50px 0; font-size: 36px; line-height: 42px;}
h1 {font-size: 48px; line-height: 48px;}
p, ol li {font-size: 18px; line-height: 24px;}
}

/* Add exceptions for malicious view */

.icon.danger {
display: block;
height: 64px;
margin-bottom:10px;
background: url('../img/warning.svg');
background-size: 64px 64px;
background-position: center;
background-repeat: no-repeat;
}

body.danger {
color:white;
background-color: #aa0000;
}

Binary file added checkmate/static/static/img/favicon.ico
Binary file not shown.
1 change: 1 addition & 0 deletions checkmate/static/static/img/warning.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
90 changes: 90 additions & 0 deletions checkmate/templates/blocked_page.html.jinja2
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
{% extends "wrapper.html.jinja2" %}

{% block title %}
Content not available
{% endblock %}

{% macro how_to_access() %}
<p>
<strong>To view the annotations on this page:</strong>
</p>

<ol>
<li>
Follow the <a href="https://web.hypothes.is/" target="_blank">instructions
on our homepage</a> to install our browser extension.
</li>
<li>
Visit the page directly at
<a href="{{ blocked_url | e }}" target="_blank">{{ blocked_url }}</a>
and click on the extension icon to see the annotations.
</li>
</ol>
{% endmacro %}

{% macro bad_site_ahead() %}
<p>
<strong>Attackers</strong> on the site may trick you into doing
something <strong>dangerous</strong> like installing software
or revealing your personal information (for example, passwords,
phone numbers or credit cards).
</p>

{# Do what we can to prevent the user from selecting the text #}
<p
style="-moz-user-select: none; -webkit-user-select: none; -ms-user-select:none; user-select:none;-o-user-select:none;"
unselectable="on"
onselectstart="return false;"
onmousedown="return false;"
>
Original URL: {{ blocked_url }}
</p>
{% endmacro %}

{% block body_attrs %}
{% if reason == "malicious" %}
class="danger"
{% endif %}
{% endblock %}

{% block content %}
{% if reason == "malicious" %}
{% set text = {
"heading": " <div class='icon danger'></div>Deceptive site ahead",
"details": "This site is not available through Hypothesis because
it might be hosting <strong>harmful</strong> content."
} %}

{% elif reason == "publisher-blocked" %}
{% set text = {
"heading": "Content not available",
"details": "Unfortunately, the publisher of this page has requested
that we disallow access to it through this Hypothesis service.
That means we can't show you the page you were looking for
right away…"
} %}

{% else %}
{% set text = {
"heading": "Content cannot be annotated",
"details": "Unfortunately, the contents of this page cannot be
annotated through this Hypothesis service."
} %}
{% endif %}

<!-- Block reason: {{ reason }} -->
<header>
<h1>{{ text.heading | safe }}</h1>
</header>

<article>
<p>{{ text.details | safe }}</p>

{% if reason == "malicious" %}
{{ bad_site_ahead() }}
{% else %}
{{ how_to_access() }}
{% endif %}
</article>

{% endblock %}
26 changes: 26 additions & 0 deletions checkmate/templates/wrapper.html.jinja2
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
<!DOCTYPE html>
<html lang="en">
<head>
<title>{% block title %}Via{% endblock %}</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<link href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,700" rel="stylesheet" type="text/css">
<link rel="icon" href="/ui/static/img/favicon.ico" type="image/x-icon" />
<link rel="stylesheet" href="/ui/static/css/wrapper-style.css">
<script async src="https://www.google-analytics.com/analytics.js"></script>
<script>
window.ga=window.ga||function(){(ga.q=ga.q||[]).push(arguments)};ga.l=+new Date;
ga('create', 'UA-26026798-1', 'auto');
ga('send', 'pageview');
</script>
</head>
<body {% block body_attrs %}{% endblock %}>
{% block content %}
{% endblock %}
<footer>
<a href="https://hypothes.is/" id="footer-logo">
Hypothesis
</a>
</footer>
</body>
</html>
Loading