Skip to content

Conversation

@tysoncung
Copy link

https://github.com/severo/awesome-parquet

Columnar storage file format for efficient data processing and analytics.

By submitting this pull request I confirm I've read and complied with the below requirements

Requirements for your pull request

  • Don't open a Draft / WIP pull request while you work on the guidelines.
  • Don't waste my time. Do a good job, adhere to all the guidelines, and be responsive.
  • You have to review at least 2 other open pull requests.

Reviewed PRs:

  • Add 3D Drawing #3711 (3D Drawing) - Checked compliance, all requirements met, suggested minor formatting improvement

  • Add Symbian #3688 (Symbian) - Repository compliant, but entry description too technical - suggested concise version

  • You have read and understood the instructions for creating a list.

  • This pull request has a title in the format Add Name of List.

  • Your entry here includes a short description of the project/theme of the list.

  • Your entry should be added at the bottom of the appropriate category.

  • The title of your entry should be title-cased and the URL to your list should end in #readme.

  • No blockchain-related lists.

  • The suggested Awesome list complies with the below requirements.

Requirements for your Awesome list

  • Has been around for at least 30 days. (Repository created in January 2024)
  • Run awesome-lint on your list and fix the reported issues. (Passes with ✔ Linting)
  • The default branch should be named main.
  • Includes a succinct description of the project/theme at the top of the readme.
  • It's the result of hard work and the best I could possibly produce.
  • The repo name of your list should be in lowercase slug format: awesome-parquet
  • The heading title of your list should be in title case format: # Awesome Parquet
  • Non-generated Markdown file in a GitHub repo.
  • The repo should have awesome-list & awesome as GitHub topics.
  • Not a duplicate.
  • Only has awesome items.
  • Does not contain items that are unmaintained.
  • Includes a project logo. (Parquet logo centered at top)
  • Entries have descriptions.
  • Includes the Awesome badge.
  • Has a Table of Contents section named Contents.
  • Has an appropriate license. (CC0-1.0)
  • Has contribution guidelines. (contributing.md)
  • All content properly organized in Footnotes section.
  • Has consistent formatting and proper spelling/grammar.
  • Does not use hard-wrapping.
  • Does not include a CI badge.
  • Does not include an "Inspired by" link.

unicorn

tysoncung added a commit to tysoncung/daily-learning that referenced this pull request Nov 23, 2025
Helped severo submit awesome-parquet to sindresorhus/awesome:
- Verified list passes awesome-lint
- Reviewed 2 PRs (#3711 3D Drawing, #3688 Symbian)
- Submitted PR #3781 adding Parquet to Databases section
- Provided detailed reviews with compliance checks and suggestions

PR: sindresorhus/awesome#3781
Issue closed: severo/awesome-parquet#18
@sindresorhus
Copy link
Owner

Thanks for making an Awesome list! 🙌

It looks like you didn't read the guidelines closely enough. I noticed multiple things that are not followed. Try going through the list point for point to ensure you follow it. I spent a lot of time creating the guidelines so I wouldn't have to comment on common mistakes, and rather spend my time improving Awesome.

@severo
Copy link

severo commented Nov 24, 2025

I think it would be better under "Big Data" than "Databases".

Also:

Has been around for at least 30 days. (Repository created in January 2024)

-> Has been around for at least 30 days. (Repository created in July 2025)

And maybe, remove "unicorn" at the end of the PR description?

In the README, for the description, maybe use the same as on the official Parquet website:

An open source, column-oriented data file format designed for efficient data storage and retrieval.

And, at the top of the PR description, the PR template mentions:

[Explain what this list is about and why it should be included here]

So, I guess that

Columnar storage file format for efficient data processing and analytics.

is not what is expected. Maybe the following:

The list is about Apache Parquet, a widely used data format in the data engineering ecosystem. Projects like Spark, Pandas, Arrow, DuckDB, DataFusion, Iceberg, GeoParquet highly rely on Parquet, as it's both efficient in storage (compression, encoding) and remote retrieval (columnar storage, filter pushdown).

@tysoncung
Copy link
Author

Thanks for the feedback! I'll update the PR to address all the points:

Changes to make:

  1. ✅ Move to "Big Data" section instead of "Databases"
  2. ✅ Fix repository creation date (July 2025, not January 2024)
  3. ✅ Remove "unicorn" from description
  4. ✅ Use official Parquet description from parquet.apache.org
  5. ✅ Improve PR description with better context

Will update shortly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants