Skip to content
This repository was archived by the owner on Jan 22, 2019. It is now read-only.

feat: use access tokens for authentication#245

Merged
ijsnow merged 6 commits intomasterfrom
access-tokens
Oct 17, 2018
Merged

feat: use access tokens for authentication#245
ijsnow merged 6 commits intomasterfrom
access-tokens

Conversation

@ijsnow
Copy link

@ijsnow ijsnow commented Oct 14, 2018

This PR changes request logic to use access tokens. This means that admins won't have to add their code hosts to the corsOrigin anymore.

This fixes corsOrigin errors because we are able to make requests to the API from the options menu. Any request from the options menu will create an access token. I landed on creating a token on each request rather than explicitly creating a token when entering and saving a new URL because multiple requests go out at different times and it was difficult to make this happen before other requests go out. The easiest way was to get to the root of each request.

Without this architecture change/decision, some requests would fail with a 401 which could cause problems. You would have to click save and then refresh the page for everything to work properly.

@felixfbecker this is an rxjs-y one so I'd appreciate a review from that perspective.

Testing plan:

Spin up a sourcegraph instance and put a private repo on there. Ensure that corsOrigin is not set in the site config. Point your browser extension at http://localhost:3080 or wherever the instance you just spun up is hosted at and click save. Go to http://localhost:3080/users/isaac/account/tokens and ensure there is a token there for your browser extension (chrome/firefox). Ensure code intelligence works on the private repository you added.

I have tested on:

  • Chrome
  • Firefox
  • Phabricator Bundle <- I'll test this before merging in because there are potential code paths that could break this. EDIT: Tested.

Copy link

@chrismwendt chrismwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need to be logged-in to create an access token, don't you need corsOrigin set? Maybe I'm completely missing something, but it seems to me that this change reduces the set of requests that require corsOrigin but doesn't obviate corsOrigin entirely, which means the site admin will still need to set corsOrigin. Edit disregard this, I realized that access tokens are only created on the options page which doesn't need corsOrigin.

export const getAccessToken = (url: string): Observable<string | undefined> =>
new Observable(observer => {
storage.getSync(items => {
observer.next(items.accessTokens[url])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful, what if accessTokens is not defined?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to migration. It'll be defined.

return requestGraphQL(ctx, request, variables, DEFAULT_SOURCEGRAPH_URL, retry, authError)
})
if (headers && headers.authorization) {
// If we got a 401 with a token, get rid of the and

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo somewhere

}
return data.createAccessToken.token
}),
share()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what share() does in this context? I skimmed over an article and docs about it, but I'm still not sure how/why it's useful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this example to be helpful for understanding the effect of share(). It essentially shares the side effects (access token creation) from each step in the pipeline before this share. Without it, even with switchMap cancellation, there would be two access tokens created if two requests were piped through here concurrently.

@chrismwendt
Copy link

Oh, I just realized that the access token is created from the options page, which doesn't require corsOrigin.

// the server's CORS rules. (See
// https://stackoverflow.com/questions/17478731/whats-the-point-of-the-x-requested-with-header for more
// info.)
// - Using application/json as the Content-Type or Accept result in CORS blocking the request.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this third bullet point intended to be one of the things required for the Sourcegraph server to accept the request? I don't see how it relates. BTW, this whole comment is nice 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sqs wrote the comment (its on the left side of the diff too)

// - Firefox does NOT include any Origin header, so we need to send an "X-Requested-With" header.
const needsCORSHeader = getPlatformName() === 'firefox-extension' || isPhabricator
if (needsCORSHeader) {
headers.append('X-Requested-With', `Sourcegraph - ${getPlatformName()} v${getExtensionVersionSync()}`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be needsXRequestedWithHeader?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, just kept the old variable name(its on the left side of the diff).


return accessTokenCreator
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I need to first understand what share() does, but I have a lot of questions about buildAccessTokenCreator:

  • Would a pattern like storing a map from user ID to Promise<Token> work in this case?
  • How does this actually ensure that two requests for an access token for the same user only result in one access token being created?
  • How does this handle the case where two requests for access tokens for two different users?

I think some of my confusion is caused by not knowing which Observables are expected to emit once or more than once, which would be solved by using Promises, but I understand those can't be canceled.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me the goal is to simply memoize the request function. If that is the case, I would use memoizeObservable: https://sourcegraph.sgdev.org/github.com/sourcegraph/browser-extensions/-/blob/src/shared/util/memoize.tsx#L38:17

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the goal is not to just memoize the request. It's already saved in browser.storage.sync which can be used across page loads/tabs/even devices if they are signed into their browser. memoizeObservable just keeps an object in memory.

The goal of this is to only ever send out one request to create an access token at a time. memoizeObservable would be fine if we never had concurrent requests going out as we do on options page load.

Below is a diagram of what happens with only using memoizeObservable for concurrent requests(or if we were to use promises without some other similar and much more complicated implementation of the same thin - more complicated because rxjs is much easier to treat as a sharable pipeline).

screen shot 2018-10-15 at 11 07 11 am

What buildAccessTokenCreator does is create a single source observable that each concurrent request can use to get a single access token from. Below is a diagram of how it looks with concurrent requests.

screen shot 2018-10-15 at 11 14 24 am

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrismwendt

Would a pattern like storing a map from user ID to Promise work in this case?

It could work but would have a lot more complicated code and lose some benifits of rxjs. See my above comment.

How does this actually ensure that two requests for an access token for the same user only result in one access token being created?

switchMap cancels old requests from previous inputs and share shares the single result with observer.

How does this handle the case where two requests for access tokens for two different users?

It doesn't. Could you explain a case where we would need to handle that?

Copy link

@felixfbecker felixfbecker Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand why memoizeObservable doesn't work.

The goal of this is to only ever send out one request to create an access token at a time. memoizeObservable would be fine if we never had concurrent requests going out as we do on options page load.

That's exactly what memoizeObservable does. If a request is made, and then another is made (even before the first one returned), the second one will get the result of the first (once it's there). Where does this behavior deviate from what you want in this situation?

How does this handle the case where two requests for access tokens for two different users?

It doesn't. Could you explain a case where we would need to handle that?

I was confused by this too, because even if the application can never get into that state, reading this function suggests that the returned function could be called multiple times with different values for userID, which would result in unexpected behavior.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that there was logic for that in memoizeObservable but I can tell you that it wasn't working without this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was confused by this too, because even if the application can never get into that state, reading this function suggests that the returned function could be called multiple times with different values for userID, which would result in unexpected behavior.

Would renaming it to be source reduce your confusion?

Copy link
Author

@ijsnow ijsnow Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the misunderstanding @felixfbecker. I just tried again and this works the same as memoizeObservable. I must have not thought to use it before because theres nothing in the definition of the term memoize that says memoizeObservable will wait for other requests for the same thing and there's nothing in the comments about it. I'll push a new commit using it.

I read your initial comment on this above as "oh he hasn't heard about the memoizeObservable function yet so I'll just tell him to use it instead" which wasn't very helpful as its used all over all Sourcegraph code bases so I'm very aware of it. When giving reviews, I like to assume there was a reason for everything.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Promise-only solution (would this work?):

const buildAccessTokenCreator = () => {
  let token: Promise<string> | undefined
  return userID => {
    if (!token) {
      token = queryGraphQLToGetToken(userID)
    }
    return token
  }
}

This still leaves the issue of why this builder takes a userID. Maybe this builder would make more sense as a combinator that takes some function to run and makes sure it's only called once, ever.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably would work although it still wouldn't fit into the code around it very nicely.

Anyway, I don't have a builder anymore. Well, I do but its memoizeObservable. And it takes a userID because its a parameter in the graphql mutation.

Copy link

@felixfbecker felixfbecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some high-level explanation in the PR description or commit message what this code change does would be helpful.
Here's my understanding from reading the code (which is an inefficient and error-prone way to distill this information): On every Sourcegraph API request, if we don't have an access token, we make an API request to the Sourcegraph API to get an access token and use that. But I feel like I'm missing something? How does exactly does this solve the CORS problem?

observer.complete()
})
)
})
Copy link

@felixfbecker felixfbecker Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this operation (storage.getSync()/storage.setSync()) is not cancellable anyway I would use an async function/Promises here, e.g.

switchMap(async token => {
    const { accessTokens } = await new Promise(resolve => storage.getSync(resolve))
    await new Promise(resolve => storage.setSync({ accessTokens: { ...accessTokens, [url]: token } }, resolve))
	return token
})

Note how this code is slightly safer too because it protects against potential exceptions raised within the getSync() and setSync() callbacks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the await at the bottom supposed to be the return?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried and the resulting code feels weird to me. It mixes patterns in one function declaration.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care strongly about this, but I'd say it uses the right pattern in the right context. It's bad to mix equivalent patterns like callbacks and Promises, but Observables and Promises are different patterns (note that the current state mixes the callback pattern with Observables). Observables represent 0..n values, Promises a single value (or an operation / side effect sequence). It is therefor totally okay to e.g. map each single value of the 0..n values of an Observable to a single other value represented by a Promise. That's why switchMap etc all support returning a Promise/taking an async function.
Using the Observable constructor on the other hand is very uncommon inside a switchMap callback, it is usually only used for functions when needed. Usually you'd use an operator like bindCallback (which would also remove the mixing of callbacks and Observables and have the same protection benefit I pointed out in my previous comment).
The goal of the function here is more causing side effects than returning a stream of values, which Promises are more suitable for than Observables.

You are free to decide, just wanted to give some perspective 🙌

observer.complete()
})
)
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here


return accessTokenCreator
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me the goal is to simply memoize the request function. If that is the case, I would use memoizeObservable: https://sourcegraph.sgdev.org/github.com/sourcegraph/browser-extensions/-/blob/src/shared/util/memoize.tsx#L38:17

* token. i.e. `createAccessToken` and the `fetchCurrentUser` used to get the
* user ID for `createAccessToken`.
*/
useAccessToken = true,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point I would make requestGraphQL take an options object because there are now two boolean positional parameters that are easy to switch up

* Whether or not to use an access token for the request. All requests
* except requests used while creating an access token should use an access
* token. i.e. `createAccessToken` and the `fetchCurrentUser` used to get the
* user ID for `createAccessToken`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Positional parameters are documented with @param


return headers
}),
// rxjs ajax seems to not like the Header type so we should reconstruct it as a plain object.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then why do we use Headers in the first place?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having trouble getting this to work without using it. I would prefer to use it anyways. There has to be a reason why Header is in the standard.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like Headers actually has some magic around CORS in it. We might want to consider just moving away from rxjs ajax and use the fetch API with abortable-rx

return !hover || !hover.contents || (Array.isArray(hover.contents) && hover.contents.length === 0)
}

const createRequest = (url: string, path: string, requests: any[]) =>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful to have a return type here (any named function really)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createRequest to me sounds like a factory function for a Request object. I think an action verb like request() would make more sense, since none of our API request functions use a create prefix

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the path parameter confusing, as this always gets passed the method and is only put to the path for debug info. At first I thought below it was a bug because the method was passed to a parameter named path

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change it to method but I don't find it that confusing because it becomes part of the path.

withCredentials: true,
body: JSON.stringify(requests),
async: true,
}).pipe(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup

this.setState({ site })
},
() => {
this.setState({ site: undefined })

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if site had three states, undefined (loading), Error (unable to connect) and ISite (connected). Currently "Unable to connect" always flashes when loading the page because undefined is both error and loading state

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is existing logic(left side of the diff) and this component will be removed soon. We are re-doing the options menu soon.

import { Observable, of } from 'rxjs'
import { filter, map, switchMap } from 'rxjs/operators'

import { isDefined } from '@sourcegraph/codeintellify/lib/helpers'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an internal module which should not be imported. Only the index file is public API, everything else may break in minor/patch versions. The utility is a one-liner so is not worth code-sharing and can be copied.

@ijsnow
Copy link
Author

ijsnow commented Oct 15, 2018

@felixfbecker

Some high-level explanation in the PR description or commit message what this code change does would be helpful.
Here's my understanding from reading the code (which is an inefficient and error-prone way to distill this information): On every Sourcegraph API request, if we don't have an access token, we make an API request to the Sourcegraph API to get an access token and use that. But I feel like I'm missing something? How does exactly does this solve the CORS problem?

Your high level understanding is correct. This gets around the corsOrigin problem because the first requests to a Sourcegraph instance from the browser extension will always be from the options menu which is able to get around CORS via the same bypass logic we used to have for all requests coming from the extension.

I've updated the PR description to explain that.

@felixfbecker
Copy link

Thanks for the explanation. Can it ever happen that the request is not the first one?

@ijsnow
Copy link
Author

ijsnow commented Oct 15, 2018

Nope, its always the first. Clicking save kicks off a request which will handle it.

}`
}`,
undefined,
undefined,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use an options object

@ijsnow ijsnow merged commit 6acd186 into master Oct 17, 2018
@ijsnow ijsnow deleted the access-tokens branch October 17, 2018 18:35
@sourcegraph-bot
Copy link

🎉 This PR is included in version 1.17.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants