Merged
12 changes: 10 additions & 2 deletions .github/workflows/artifact-tests.yml
@@ -1,13 +1,21 @@
name: artifact-unit-tests
on: push
on:
Contributor Author

Also updating these to match our filters in unit-tests.yml

push:
branches:
- master
paths-ignore:
- '**.md'
pull_request:
paths-ignore:
- '**.md'

jobs:
build:
name: Build

strategy:
matrix:
runs-on: [ubuntu-latest, windows-latest, macOS-latest]
runs-on: [ubuntu-latest, windows-latest, macos-latest]
Contributor Author

@konradpabjan konradpabjan May 14, 2020

All of our public documentation has macos so I'm changing it just for consistency. A GitHub user actually brought this up in one of my other PRs so I'm looking out for this now.

https://help.github.com/en/actions/reference/virtual-environments-for-github-hosted-runners#supported-runners-and-hardware-resources

https://github.com/actions/virtual-environments#available-environments

Contributor

These are case-insensitive and macOS is more correct than macos based on how Apple names it

fail-fast: false

runs-on: ${{ matrix.runs-on }}
4 changes: 4 additions & 0 deletions packages/artifact/docs/implementation-details.md
@@ -6,6 +6,10 @@ Warning: Implementation details may change at any time without notice. This is m

![image](https://user-images.githubusercontent.com/16109154/79765587-19522b00-8327-11ea-9679-410bb10e1b13.png)

During artifact upload, gzip is used to compress individual files before they are uploaded. Compression minimizes the amount of data that gets uploaded, which reduces the total number of HTTP calls (upload happens in 4MB chunks). This results in considerably faster uploads, with huge performance implications especially on self-hosted runners.

If a file is less than 64KB in size, a passthrough stream (readable and writable) is used to convert an in-memory buffer into a readable stream without any extra streams or piping.

## Retry Logic when downloading an individual file

![image](https://user-images.githubusercontent.com/16109154/78555461-5be71400-780d-11ea-9abd-b05b77a95a3f.png)
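
As a rough illustration of the small-file path described in the implementation-details.md text above, the sketch below gzips an in-memory buffer, falls back to the original bytes when compression does not shrink them, and wraps the chosen bytes in a PassThrough stream. The name `prepareSmallUpload` is hypothetical, `zlib.gzipSync` stands in for the PR's `createGZipFileInBuffer`, and the fallback reuses the in-memory bytes instead of calling `fs.createReadStream`. Treat it as a sketch under those assumptions, not the package's implementation.

```typescript
import {PassThrough, Readable} from 'stream'
import * as zlib from 'zlib'

// Hypothetical result shape; the field names mirror variables used in the PR.
interface SmallUpload {
  isGzip: boolean
  uploadFileSize: number
  openUploadStream: () => Readable
}

// Hypothetical helper: decide what to upload for a small file whose contents
// are already in memory.
function prepareSmallUpload(original: Buffer): SmallUpload {
  const compressed = zlib.gzipSync(original)

  // Same comparison as the PR: if the gzip output is larger than the original,
  // compression did not help, so the original bytes are uploaded instead.
  const isGzip = !(original.byteLength < compressed.byteLength)
  const payload = isGzip ? compressed : original

  return {
    isGzip,
    uploadFileSize: payload.byteLength,
    // A PassThrough is both writable and readable. Writing the buffer into it
    // and ending it gives a readable stream without any extra piping.
    // A new PassThrough is created on every call.
    openUploadStream: () => {
      const passThrough = new PassThrough()
      passThrough.end(payload)
      return passThrough
    }
  }
}
```

Calling `openUploadStream()` twice returns two independent streams over the same bytes, which is what makes a retry possible.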
34 changes: 20 additions & 14 deletions packages/artifact/src/internal/upload-http-client.ts
@@ -208,25 +208,30 @@ export class UploadHttpClient {
// for creating a new GZip file, an in-memory buffer is used for compression
if (totalFileSize < 65536) {
const buffer = await createGZipFileInBuffer(parameters.file)
let uploadStream: NodeJS.ReadableStream

// A function that opens a new stream is needed so that a failed chunk upload can be retried. If a NodeJS.ReadableStream is passed in directly,
// it will not get reset to the start of the stream when the chunk upload is retried
let openUploadStream: () => NodeJS.ReadableStream

if (totalFileSize < buffer.byteLength) {
// compression did not help with reducing the size, use a readable stream from the original file for upload
uploadStream = fs.createReadStream(parameters.file)
openUploadStream = () => fs.createReadStream(parameters.file)
isGzip = false
uploadFileSize = totalFileSize
} else {
// create a readable stream using a PassThrough stream that is both readable and writable
const passThrough = new stream.PassThrough()
passThrough.end(buffer)
uploadStream = passThrough
openUploadStream = () => {
const passThrough = new stream.PassThrough()
Contributor

I realize this was the existing code, but why do we need a stream that is both readable and writable for upload?

Contributor Author

@konradpabjan konradpabjan May 14, 2020

It's an implementation detail: we have an in-memory buffer and we have to convert it somehow to a readable stream. A passthrough stream is used to get the job done without any extra piping or secondary streams. During the actual upload, it's treated only as a readable stream so it makes no difference.

I think this is where I originally got the idea from: https://stackoverflow.com/questions/16038705/how-to-wrap-a-buffer-as-a-stream2-readable-stream

I recall experimenting with a bunch of other techniques but they all ended up being considerably more complex.

passThrough.end(buffer)
return passThrough
}
uploadFileSize = buffer.byteLength
}

const result = await this.uploadChunk(
httpClientIndex,
parameters.resourceUrl,
uploadStream,
openUploadStream,
0,
uploadFileSize - 1,
uploadFileSize,
@@ -296,11 +301,12 @@ export class UploadHttpClient {
const result = await this.uploadChunk(
httpClientIndex,
parameters.resourceUrl,
fs.createReadStream(uploadFilePath, {
start,
end,
autoClose: false
}),
() =>
fs.createReadStream(uploadFilePath, {
start,
end,
autoClose: false
}),
start,
end,
uploadFileSize,
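
The hunk above wraps `fs.createReadStream` in a function so that each chunk's byte range can be reopened on demand. A minimal sketch of how such per-chunk factories could be built is shown here. The helper names `chunkRanges` and `openChunkStreams` are hypothetical, and the chunk size is simply passed in (the documentation in this PR mentions 4MB chunks). Note that the `start` and `end` options of `fs.createReadStream` are both inclusive, which is why each range ends at `start + chunkSize - 1`.

```typescript
import * as fs from 'fs'

// Inclusive byte range of one chunk.
interface ByteRange {
  start: number
  end: number
}

// Hypothetical helper: split a file of `fileSize` bytes into inclusive ranges
// of at most `chunkSize` bytes each.
function chunkRanges(fileSize: number, chunkSize: number): ByteRange[] {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive')
  }
  const ranges: ByteRange[] = []
  for (let start = 0; start < fileSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, fileSize) - 1
    ranges.push({start, end})
  }
  return ranges
}

// Hypothetical helper: one stream factory per chunk. No file is opened until
// a factory is called, and every call opens the range again from its first byte.
function openChunkStreams(
  filePath: string,
  fileSize: number,
  chunkSize: number
): (() => fs.ReadStream)[] {
  return chunkRanges(fileSize, chunkSize).map(({start, end}) => () =>
    // Same options as in the PR.
    fs.createReadStream(filePath, {start, end, autoClose: false})
  )
}
```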
@@ -335,7 +341,7 @@ export class UploadHttpClient {
* indicates a retryable status, we try to upload the chunk as well
* @param {number} httpClientIndex The index of the httpClient being used to make all the necessary calls
* @param {string} resourceUrl Url of the resource that the chunk will be uploaded to
* @param {NodeJS.ReadableStream} data Stream of the file that will be uploaded
* @param {() => NodeJS.ReadableStream} openStream Function that opens a new stream of the file that will be uploaded
* @param {number} start Starting byte index of file that the chunk belongs to
* @param {number} end Ending byte index of file that the chunk belongs to
* @param {number} uploadFileSize Total size of the file in bytes that is being uploaded
@@ -346,7 +352,7 @@ export class UploadHttpClient {
private async uploadChunk(
httpClientIndex: number,
resourceUrl: string,
data: NodeJS.ReadableStream,
openStream: () => NodeJS.ReadableStream,
start: number,
end: number,
uploadFileSize: number,
@@ -365,7 +371,7 @@ export class UploadHttpClient {

const uploadChunkRequest = async (): Promise<IHttpClientResponse> => {
const client = this.uploadHttpManager.getClient(httpClientIndex)
return await client.sendStream('PUT', resourceUrl, data, headers)
return await client.sendStream('PUT', resourceUrl, openStream(), headers)
}

let retryCount = 0
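
The reason for calling `openStream()` inside `uploadChunkRequest` can be shown with a small standalone sketch. A stream that a failed attempt has already consumed cannot be rewound, so each attempt needs its own stream. The function `sendWithRetry`, its `send` callback and the simple attempt counter are hypothetical stand-ins; the actual retry loop in `uploadChunk` is not shown in this diff and may differ.

```typescript
import {Readable} from 'stream'

// Hypothetical helper: call `send` up to `maxAttempts` times, giving every
// attempt a stream of its own.
async function sendWithRetry<T>(
  openStream: () => Readable,
  send: (body: Readable) => Promise<T>,
  maxAttempts: number
): Promise<T> {
  let lastError: unknown = new Error('maxAttempts must be at least 1')
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Open a new stream for this attempt. Reusing the stream from a failed
      // attempt would resume wherever that attempt stopped reading.
      return await send(openStream())
    } catch (error) {
      lastError = error
    }
  }
  throw lastError
}
```

Passing a function instead of a stream keeps the decision of how to reopen the data (a file range, an in-memory buffer) with the caller, while the retry code only needs to call it.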