Skip to content

Rollout#542

Merged
dboskovic merged 2 commits intomainfrom
rollout
Jun 28, 2024
Merged

Rollout#542
dboskovic merged 2 commits intomainfrom
rollout

Conversation

@dboskovic
Copy link
Contributor

@dboskovic dboskovic commented Jun 28, 2024

Introduces Simplified record helpers for memory efficient processing of large datasets and launches a rollout plugin for helping with schema updating.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jun 28, 2024

Walkthrough

This update introduces enhanced file handling and processing functionalities designed for memory-efficient large dataset operations. It includes a new plugin for auto-updating schema changes across live workbooks, simplification of record manipulation with new helper functions, and an additional utility for controlled concurrent asynchronous operations.

Changes

File/Directory Change Summary
.changeset/lemon-trainers-rescue.md Introduced simplified record helpers for memory-efficient processing of large datasets and a schema updating rollout plugin.
plugins/rollout/README.md Introduced the @flatfile/plugin-rollout for auto-updating workbooks' schemas and triggering hooks based on events.
plugins/rollout/rollup.config.mjs Added configuration setup with @flatfile/rollup-config for building a configuration object.
plugins/rollout/src/index.ts Added an auto-update plugin that triggers schema updates for spaces upon new code deployment.
utils/common/src/all.records.ts Refactored getRecordsRaw, added new functions for record operations (createRecords, updateRecords, updateAllRecords, createAllRecords), and a tick helper type for progress tracking.
utils/common/src/async.helpers.ts Introduced asyncLimitSeries for executing asynchronous tasks with a concurrency limit.
utils/common/src/index.ts Added export for simple.records.
utils/common/src/simple.records.ts Introduced the Simplified class for handling records, alongside new methods for data transformation and interaction, while defining new types (Primitive, SimpleRecord, SimpleValues).

Sequence Diagram(s)

Auto-Update Schema Plugin

sequenceDiagram
  participant Admin
  participant DeploymentService
  participant RolloutPlugin
  participant FlatfileServer
  
  Admin->>DeploymentService: Deploy new code
  DeploymentService->>RolloutPlugin: Trigger rollout
  RolloutPlugin->>FlatfileServer: Retrieve current space and workbooks
  FlatfileServer-->>RolloutPlugin: Return space and workbooks
  RolloutPlugin->>FlatfileServer: Update workbooks' schemas based on new configuration
  FlatfileServer-->>RolloutPlugin: Confirm updates
  RolloutPlugin->>Admin: Notify completion
Loading

Simplified Record Handling

sequenceDiagram
  participant Developer
  participant Simplified
  participant FlatfileAPI
  
  Developer->>Simplified: Get all records by sheet ID
  Simplified->>FlatfileAPI: Fetch records
  FlatfileAPI-->>Simplified: Return records
  Simplified->>Developer: Transform records to SimpleRecord format

  Developer->>Simplified: Update records
  Simplified->>FlatfileAPI: Send update records request
  FlatfileAPI-->>Simplified: Confirm updates
  Simplified->>Developer: Notify completion
Loading

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between adca51c and c087208.

Files ignored due to path filters (2)
  • package-lock.json is excluded by !**/package-lock.json, !**/*.json
  • plugins/rollout/package.json is excluded by !**/*.json
Files selected for processing (8)
  • .changeset/lemon-trainers-rescue.md (1 hunks)
  • plugins/rollout/README.md (1 hunks)
  • plugins/rollout/rollup.config.mjs (1 hunks)
  • plugins/rollout/src/index.ts (1 hunks)
  • utils/common/src/all.records.ts (4 hunks)
  • utils/common/src/async.helpers.ts (1 hunks)
  • utils/common/src/index.ts (1 hunks)
  • utils/common/src/simple.records.ts (1 hunks)
Files skipped from review due to trivial changes (2)
  • plugins/rollout/rollup.config.mjs
  • utils/common/src/index.ts
Additional context used
Markdownlint
plugins/rollout/README.md

20-20: Expected: h3; Actual: h4
Heading levels should only increment by one level at a time

(MD001, heading-increment)


42-42: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


47-47: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


52-52: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)


6-6: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


14-14: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


17-17: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


28-28: Expected: 1; Actual: 2
Multiple consecutive blank lines

(MD012, no-multiple-blanks)


3-3: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


42-42: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


47-47: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


52-52: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


10-10: null
Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


41-41: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


46-46: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


51-51: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

Biome
utils/common/src/simple.records.ts

[error] 13-113: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

utils/common/src/all.records.ts

[error] 65-65: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)


[error] 67-67: void is confusing inside a union type.

Unsafe fix: Use undefined instead.

(lint/suspicious/noConfusingVoidType)


[error] 84-84: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

Additional comments not posted (9)
.changeset/lemon-trainers-rescue.md (1)

6-6: Consider elaborating on the changeset description.

The current description is clear but adding more details about how these changes integrate with existing functionalities or any dependencies they might have could provide a better understanding for other developers and users.

plugins/rollout/README.md (2)

20-20: Adjust heading level for better structure.

Change the heading level from h4 to h3 to maintain a proper increment in heading levels.

- #### `namespace` - `string`
+ ### `namespace` - `string`
Tools
Markdownlint

20-20: Expected: h3; Actual: h4
Heading levels should only increment by one level at a time

(MD001, heading-increment)


42-42: Add surrounding blank lines for code blocks.

To adhere to Markdown best practices, ensure that code blocks are surrounded by blank lines.

+ 
```bash 
43~
npm i @flatfile/plugin-rollout
44~
48~
import { rollout } from "@flatfile/plugin-rollout";
49~
53~
listener.use(rollout({ updater: (workbooks) => {
54~
  // update workbooks here with new schema
55~
}}));
56~


Also applies to: 47-47, 52-52

<details>
<summary>Tools</summary>

<details>
<summary>Markdownlint</summary><blockquote>

42-42: Expected: 0 or 2; Actual: 1
Trailing spaces

(MD009, no-trailing-spaces)

---

42-42: null
Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

</blockquote></details>

</details>

</blockquote></details>
<details>
<summary>utils/common/src/all.records.ts (6)</summary><blockquote>

`13-13`: **Simplified return type in `getRecordsRaw`.**

The function now returns only `Array<Flatfile.Record_>`, making error handling clearer. This is a positive change as it simplifies the function's interface.

---

`104-126`: **New function `createRecords` added.**

This function correctly uses the POST method to create records and includes handling for HTTP responses. Consider improving error handling as suggested previously.

---

`131-158`: **New function `updateRecords` added.**

This function correctly uses the PUT method to update records, includes custom headers to force hooks, and specifically handles HTTP status 304. Improve error handling as previously suggested.

---

`163-181`: **New function `updateAllRecords` added.**

This function updates all records in batches using pagination and concurrency control, which is a robust approach for handling large datasets.

---

`186-204`: **New function `createAllRecords` added.**

This function efficiently handles the creation of records in batches, using pagination and concurrency control. This is a good practice for managing large datasets.

---

`206-210`: **Type definition for `TickHelper`.**

The `TickHelper` type is well-defined and appropriately used in batch operations for progress tracking.

</blockquote></details>

</blockquote></details>

</details><!-- tips_start -->

---





<details>
<summary>Tips</summary>

### Chat

There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai):

- Review comments: Directly reply to a review comment made by CodeRabbit. Example:
	- `I pushed a fix in commit <commit_id>.`
	- `Generate unit testing code for this file.`
	- `Open a follow-up GitHub issue for this discussion.`
- Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples:
	- `@coderabbitai generate unit testing code for this file.`
	-	`@coderabbitai modularize this function.`
- PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
	- `@coderabbitai generate interesting stats about this repository and render them as a table.`
	- `@coderabbitai show all the console.log statements in this repository.`
	- `@coderabbitai read src/utils.ts and generate unit testing code.`
	- `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.`
	- `@coderabbitai help me debug CodeRabbit configuration file.`

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

### CodeRabbit Commands (invoked as PR comments)

- `@coderabbitai pause` to pause the reviews on a PR.
- `@coderabbitai resume` to resume the paused reviews.
- `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
- `@coderabbitai full review` to do a full review from scratch and review all the files again.
- `@coderabbitai summary` to regenerate the summary of the PR.
- `@coderabbitai resolve` resolve all the CodeRabbit review comments.
- `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository.
- `@coderabbitai help` to get help.


Additionally, you can add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed.

### CodeRabbit Configration File (`.coderabbit.yaml`)

- You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository.
- Please see the [configuration documentation](https://docs.coderabbit.ai/guides/configure-coderabbit) for more information.
- If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json`

### Documentation and Community

- Visit our [Documentation](https://coderabbit.ai/docs) for detailed information on how to use CodeRabbit.
- Join our [Discord Community](https://discord.com/invite/GsXnASn26c) to get help, request features, and share feedback.
- Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

</details>

<!-- tips_end -->

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

const token =
CrossEnvConfig.get('FLATFILE_BEARER_TOKEN') ||
CrossEnvConfig.get('FLATFILE_API_KEY')
const { baseUrl, token } = getCredentials()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve error handling in getRecordsRaw.

Currently, errors from the HTTP response are only logged and then an exception is thrown with a generic message. Consider enhancing error handling by including more details from the HTTP response in the thrown exception.

-    console.log(await httpResponse?.text())
-    throw new Error(`Creating records failed.`)
+    const errorText = await httpResponse.text();
+    throw new Error(`Creating records failed: ${errorText}`);

Also applies to: 120-123

Comment on lines +212 to +222
function getCredentials() {
const baseUrl =
CrossEnvConfig.get('AGENT_INTERNAL_URL') ||
CrossEnvConfig.get('FLATFILE_API_URL') ||
'https://platform.flatfile.com/api'
const token =
CrossEnvConfig.get('FLATFILE_BEARER_TOKEN') ||
CrossEnvConfig.get('FLATFILE_API_KEY')

return { baseUrl, token }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function getCredentials reviewed.

This function effectively retrieves credentials from environment variables with appropriate fallbacks. Consider adding error checking to ensure that necessary credentials are not missing.

+ if (!token) {
+   throw new Error('API token is missing');
+ }
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function getCredentials() {
const baseUrl =
CrossEnvConfig.get('AGENT_INTERNAL_URL') ||
CrossEnvConfig.get('FLATFILE_API_URL') ||
'https://platform.flatfile.com/api'
const token =
CrossEnvConfig.get('FLATFILE_BEARER_TOKEN') ||
CrossEnvConfig.get('FLATFILE_API_KEY')
return { baseUrl, token }
}
function getCredentials() {
const baseUrl =
CrossEnvConfig.get('AGENT_INTERNAL_URL') ||
CrossEnvConfig.get('FLATFILE_API_URL') ||
'https://platform.flatfile.com/api'
const token =
CrossEnvConfig.get('FLATFILE_BEARER_TOKEN') ||
CrossEnvConfig.get('FLATFILE_API_KEY')
if (!token) {
throw new Error('API token is missing');
}
return { baseUrl, token }
}

Comment on lines +1 to +12
export async function asyncLimitSeries<T>(
limit: number,
fn: (i: number) => Promise<T>
): Promise<T[]> {
let response = []

for (let i = 0; i < limit; i++) {
const res = await fn(i)
response.push(res)
}

return response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling in asyncLimitSeries.

The function currently does not handle errors that might occur during the execution of fn(i). Adding try-catch blocks could improve robustness.

  for (let i = 0; i < limit; i++) {
+   try {
+     const res = await fn(i)
+     response.push(res)
+   } catch (e) {
+     console.error(`Error processing task ${i}:`, e)
+     // Handle error appropriately, e.g., continue, break, or modify response
+   }
-   const res = await fn(i)
-   response.push(res)
  }
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function asyncLimitSeries<T>(
limit: number,
fn: (i: number) => Promise<T>
): Promise<T[]> {
let response = []
for (let i = 0; i < limit; i++) {
const res = await fn(i)
response.push(res)
}
return response
export async function asyncLimitSeries<T>(
limit: number,
fn: (i: number) => Promise<T>
): Promise<T[]> {
let response = []
for (let i = 0; i < limit; i++) {
try {
const res = await fn(i)
response.push(res)
} catch (e) {
console.error(`Error processing task ${i}:`, e)
// Handle error appropriately, e.g., continue, break, or modify response
}
}
return response
}

Comment on lines +13 to +113
export class Simplified {
/**
* Return all records for a sheet
*
* @param sheetId
* @param options
* @param tick
*/
static async getAllRecords(
sheetId: string,
options: Flatfile.records.GetRecordsRequest = {},
tick?: TickHelper
): Promise<SimpleRecord[]> {
const recordCount = await getSheetLength(sheetId)
const pageCount = Math.ceil(recordCount / PAGE_SIZE)

const recordPages = await asyncLimitSeries(pageCount, async (i: number) => {
await tick?.((i + 1) / pageCount, i + 1, pageCount).catch(console.log)
const res = await getRecordsRaw(sheetId, {
...options,
pageNumber: i + 1,
})
return res.map(Simplified.toSimpleRecord)
})
return recordPages.flat(1)
}

static async findRecordsLimit(
sheetId: string,
options: Flatfile.records.GetRecordsRequest,
limit = 100
) {
const records = await getRecordsRaw(sheetId, {
...options,
pageSize: limit,
pageNumber: 1,
})
if (Array.isArray(records)) {
return records.map(Simplified.toSimpleRecord)
}
return []
}

/**
* { foo: bar } => { foo : {value: bar}}
* @param obj
*/
static toRecordValues(obj: SimpleRecord): Flatfile.RecordData {
return Object.fromEntries(
Object.entries(obj)
.filter(([key]) => key !== '_id' && key !== '_metadata')
.map(([key, value]) => [key, { value: value as Primitive }])
)
}

static toStandardRecord(obj: SimpleRecord) {
return {
id: obj._id as string,
metadata: obj._metadata as any,
values: Simplified.toRecordValues(obj),
}
}

/**
*
* @param r
*/
static toSimpleRecord(r: Flatfile.Record_): SimpleRecord {
const obj = Object.fromEntries(
Object.entries(r.values).map(
([key, value]) => [key, value.value] as [string, any]
)
)
obj.id = r.id
return obj as SimpleRecord
}

static updateAllRecords(
sheetId: string,
records: SimpleRecord[],
tick?: TickHelper
): Promise<void> {
return updateAllRecords(
sheetId,
records.map(Simplified.toStandardRecord),
tick
)
}

static async createAllRecords(
sheetId: string,
records: SimpleRecord[],
tick?: TickHelper
): Promise<void> {
await createAllRecords(
sheetId,
records.map(Simplified.toRecordValues),
tick
)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider refactoring the Simplified class to use functions instead of static methods.

Using a class with only static members can be less efficient and less modular than using standalone functions. This also aligns with the static analysis recommendation.

- export class Simplified {
-   static getAllRecords(sheetId: string, options: Flatfile.records.GetRecordsRequest = {}, tick?: TickHelper): Promise<SimpleRecord[]> {
+ export function getAllRecords(sheetId: string, options: Flatfile.records.GetRecordsRequest = {}, tick?: TickHelper): Promise<SimpleRecord[]> {
...
-   static updateAllRecords(sheetId: string, records: SimpleRecord[], tick?: TickHelper): Promise<void> {
+ export function updateAllRecords(sheetId: string, records: SimpleRecord[], tick?: TickHelper): Promise<void> {
...
- }

Committable suggestion was skipped due to low confidence.

Tools
Biome

[error] 13-113: Avoid classes that contain only static members.

Prefer using simple functions instead of classes with only static members.

(lint/complexity/noStaticOnlyClass)

Comment on lines +33 to +158
export function rollout(config: {
namespace: string
dev: boolean
updater: (
space: Flatfile.Space,
workbooks: Flatfile.Workbook[]
) => Promise<undefined | Flatfile.Workbook[]>
}) {
async function prepareUpdate(dev: boolean = false) {
const { data: spaces } = await api.spaces.list({
archived: false,
namespace: config.namespace.split(':')[1],
pageSize: 1000,
})

console.log(`Fetched ${spaces.length} spaces`)
const spacesWithSecrets = await asyncMap(
spaces,
async (space) => {
const { data: secrets } = await api.secrets.list({ spaceId: space.id })
return { space, secrets }
},
10
)
console.log(`Hydrated secrets for ${spacesWithSecrets.length} spaces`)

const prodUpdate = spacesWithSecrets.filter(({ secrets }) => {
return secrets.some(
(s) => s.name === 'FF_AUTO_UPDATE' && s.value === 'true'
)
})

const devUpdate = spacesWithSecrets.filter(({ secrets }) => {
return secrets.some(
(s) => s.name === 'FF_AUTO_UPDATE_DEV' && s.value === 'true'
)
})

const updatedList = await asyncMap(
dev ? devUpdate : prodUpdate,
({ space }) => {
return triggerUpdateJobForSpace(space)
}
)

// loop through each and determine if it needs to be updated by checking secrets
console.log(`Triggered schema for ${updatedList.length} spaces`)
}

async function triggerUpdateJobForSpace(space: Flatfile.Space) {
await api.jobs.create({
type: 'space',
source: space.id,
operation: 'auto-update',
trigger: 'immediate',
managed: true,
mode: 'foreground',
environmentId: space.environmentId,
})
}

const cb = (listener: FlatfileListener) => {
const isLambda = !!process.env.LAMBDA_TASK_ROOT

if (!isLambda && config.dev) {
prepareUpdate(true).then(() => {
console.log(
'running local dev refresh based update of spaces, suppress this with dev=false in the autoUpdate plugin config'
)
})
}

const triggerHooks = async (sheetId: string) => {
// loop through all records and increment metadata somehow triggering a new version
// this will trigger the auto update to run
const uuid = randomUUID()
console.log('Hooks updating =>', sheetId)

const records = await Simplified.getAllRecords(sheetId)
const updatedRecords = records.map((record) => {
record.metadata = {
...((record.metadata as any) || {}),
_autoUpdateKey: uuid,
}
return record
})

await Simplified.updateAllRecords(sheetId, updatedRecords)
console.log('Hooks updated =>', sheetId)
}

listener.use(
jobHandler('space:auto-update', async (e) => {
const { data: space } = await api.spaces.get(e.context.spaceId)
// which workbook to update
const { data: workbooks } = await api.workbooks.list({
spaceId: space.id,
})
const updated = await config.updater(
space,
workbooks.filter((wb) => !wb.name.includes('[file]'))
)
if (updated?.length) {
await asyncMap(updated, async (wb) => {
console.log('workbook schema updated, triggering hooks', wb.id)
await asyncMap(
wb.sheets,
(sheet) => {
return triggerHooks(sheet.id)
},
10
)
})
}
return { info: 'Successfully updated schema' }
})
)
}

cb.root = (listener: FlatfileListener) => {
listener.on('agent:created', () => prepareUpdate(false))
listener.on('agent:updated', () => prepareUpdate(false))
}

return cb
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhance error handling and modularize the rollout function.

The rollout function is complex and handles multiple operations. Consider breaking it down into smaller, more manageable functions. Also, add error handling to improve robustness, especially for API calls.

+ try {
  async function prepareUpdate(dev: boolean = false) {
    const { data: spaces } = await api.spaces.list({
...
- }
+ } catch (error) {
+   console.error('Failed to prepare update:', error)
+ }

Committable suggestion was skipped due to low confidence.

@dboskovic dboskovic merged commit 0a4afa0 into main Jun 28, 2024
@dboskovic dboskovic deleted the rollout branch June 28, 2024 19:57
@coderabbitai coderabbitai bot mentioned this pull request Sep 23, 2024
@coderabbitai coderabbitai bot mentioned this pull request Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant