From 583c2ff11346ae49ceb70b2bfaf3e6cc48bd53ec Mon Sep 17 00:00:00 2001 From: RunningDB Date: Wed, 25 Dec 2024 15:28:58 +0800 Subject: [PATCH] [docs] Fix typo in docs/content/pk-table&append-table --- docs/content/append-table/bucketed.md | 2 +- docs/content/append-table/query-performance.md | 4 ++-- docs/content/primary-key-table/compaction.md | 2 +- docs/content/primary-key-table/overview.md | 2 +- docs/content/primary-key-table/query-performance.md | 2 +- 5 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/content/append-table/bucketed.md b/docs/content/append-table/bucketed.md index 1a2828901acc..3904075f6975 100644 --- a/docs/content/append-table/bucketed.md +++ b/docs/content/append-table/bucketed.md @@ -196,4 +196,4 @@ The `spark.sql.sources.v2.bucketing.enabled` config is used to enable bucketing Spark will recognize the specific distribution reported by a V2 data source through SupportsReportPartitioning, and will try to avoid shuffle if necessary. -The costly join shuffle will be avoided if two tables have same bucketing strategy and same number of buckets. +The costly join shuffle will be avoided if two tables have the same bucketing strategy and the same number of buckets. diff --git a/docs/content/append-table/query-performance.md b/docs/content/append-table/query-performance.md index 7ec745468ef5..d150c95bbc45 100644 --- a/docs/content/append-table/query-performance.md +++ b/docs/content/append-table/query-performance.md @@ -35,7 +35,7 @@ filtering, if the filtering effect is good, the query would have been minutes of milliseconds to complete the execution. Often the data distribution is not always effective filtering, so if we can sort the data by the field in `WHERE` condition?
-You can take a look to [Flink COMPACT Action]({{< ref "maintenance/dedicated-compaction#sort-compact" >}}) or +You can take a look at [Flink COMPACT Action]({{< ref "maintenance/dedicated-compaction#sort-compact" >}}) or [Flink COMPACT Procedure]({{< ref "flink/procedures" >}}) or [Spark COMPACT Procedure]({{< ref "spark/procedures" >}}). ## Data Skipping By File Index @@ -54,7 +54,7 @@ file is too small, it will be stored directly in the manifest, otherwise in the corresponds to an index file, which has a separate file definition and can contain different types of indexes with multiple columns. -Different file index may be efficient in different scenario. For example bloom filter may speed up query in point lookup +Different file indexes may be efficient in different scenarios. For example, a bloom filter may speed up a query in a point lookup scenario. Using a bitmap may consume more space but can result in greater accuracy. `Bloom Filter`: diff --git a/docs/content/primary-key-table/compaction.md b/docs/content/primary-key-table/compaction.md index bee8c16e46e9..208a14e5ad8c 100644 --- a/docs/content/primary-key-table/compaction.md +++ b/docs/content/primary-key-table/compaction.md @@ -89,7 +89,7 @@ Paimon also provides a configuration that allows for regular execution of Full C 1. 'compaction.optimization-interval': Implying how often to perform an optimization full compaction, this configuration is used to ensure the query timeliness of the read-optimized system table. -2. 'full-compaction.delta-commits': Full compaction will be constantly triggered after delta commits. its disadvantage +2. 'full-compaction.delta-commits': Full compaction will be constantly triggered after delta commits. Its disadvantage is that it can only perform compaction synchronously, which will affect writing efficiency.
## Compaction Options diff --git a/docs/content/primary-key-table/overview.md b/docs/content/primary-key-table/overview.md index 552d60eff6de..d99a9ff683b0 100644 --- a/docs/content/primary-key-table/overview.md +++ b/docs/content/primary-key-table/overview.md @@ -56,6 +56,6 @@ Records within a data file are sorted by their primary keys. Within a sorted run {{< img src="/img/sorted-runs.png">}} -As you can see, different sorted runs may have overlapped primary key ranges, and may even contain the same primary key. When querying the LSM tree, all sorted runs must be combined and all records with the same primary key must be merged according to the user-specified [merge engine]({{< ref "primary-key-table/merge-engine/overview" >}}) and the timestamp of each record. +As you can see, different sorted runs may have overlapping primary key ranges, and may even contain the same primary key. When querying the LSM tree, all sorted runs must be combined and all records with the same primary key must be merged according to the user-specified [merge engine]({{< ref "primary-key-table/merge-engine/overview" >}}) and the timestamp of each record. New records written into the LSM tree will be first buffered in memory. When the memory buffer is full, all records in memory will be sorted and flushed to disk. A new sorted run is now created. diff --git a/docs/content/primary-key-table/query-performance.md b/docs/content/primary-key-table/query-performance.md index dea4899360ab..c32f1c5e662f 100644 --- a/docs/content/primary-key-table/query-performance.md +++ b/docs/content/primary-key-table/query-performance.md @@ -34,7 +34,7 @@ For Merge On Read table, the most important thing you should pay attention to is the concurrency of reading data.
For MOW (Deletion Vectors) or COW table or [Read Optimized]({{< ref "concepts/system-tables#read-optimized-table" >}}) table, -There is no limit to the concurrency of reading data, and they can also utilize some filtering conditions for non-primary-key columns. +there is no limit to the concurrency of reading data, and they can also utilize some filtering conditions for non-primary-key columns. ## Data Skipping By Primary Key Filter
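
Note: the last hunk touches the docs' discussion of MOW (Deletion Vectors) read concurrency. For context, a minimal SQL sketch of a Paimon primary-key table in that mode — table and column names here are invented for illustration, and `'deletion-vectors.enabled'` is the table option the Paimon docs describe for this mode:

```sql
-- Sketch only: a primary-key table with deletion vectors enabled,
-- giving merge-on-write read behavior (names are illustrative).
CREATE TABLE orders (
    order_id BIGINT,
    status   STRING,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'deletion-vectors.enabled' = 'true'
);
```

With this option set, readers can scan data files concurrently without merging sorted runs, which is why the concurrency limit discussed for plain Merge On Read tables does not apply.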