From bb4ad73ec40c8bd01960510312d7ec55396bca43 Mon Sep 17 00:00:00 2001
From: Kevin Gurney
Date: Wed, 16 Aug 2023 15:10:34 -0400
Subject: [PATCH 01/13] Update introductory information in `README.md`.

Co-authored-by: Sarah Gilmore
---
 matlab/README.md | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/matlab/README.md b/matlab/README.md
index 13b27274d8b..54878a26ef6 100644
--- a/matlab/README.md
+++ b/matlab/README.md
@@ -27,17 +27,16 @@ This is a very early stage MATLAB interface to the Apache Arrow C++ libraries.
 
 Currently, the MATLAB interface supports:
 
-1. Creating a subset of Arrow `Array` types (e.g. numeric and boolean) from MATLAB data
-2. Reading and writing numeric types from/to Feather v1 files.
+1. Creating a subset of Arrow `Array` types from MATLAB data (see table below)
+2. Converting MATLAB `table`s to `arrow.tabular.RecordBatch`s
+3. Reading and writing Feather v1 files
 
 Supported `arrow.array.Array` types are included in the table below.
 
-**NOTE**: All Arrow `Array` classes are part of the `arrow.array` package (e.g. `arrow.array.Float64Array`).
+**NOTE**: All Arrow `Array` classes listed below are part of the `arrow.array` package (e.g. `arrow.array.Float64Array`).
 
 | MATLAB Array Type | Arrow Array Type |
 | ----------------- | ---------------- |
-| `single` | `Float32Array` |
-| `double` | `Float64Array` |
 | `uint8` | `UInt8Array` |
 | `uint16` | `UInt16Array` |
 | `uint32` | `UInt32Array` |
@@ -46,7 +45,11 @@ Supported `arrow.array.Array` types are included in the table below.
| `int16` | `Int16Array` | | `int32` | `Int32Array` | | `int64` | `Int64Array` | +| `single` | `Float32Array` | +| `double` | `Float64Array` | | `logical` | `BooleanArray` | +| `string` | `StringArray` | +| `datetime` | `TimestampArray` | ## Prerequisites From 49b3185fcc292b91b69bd79c21ff055e461bd7a4 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 15:17:10 -0400 Subject: [PATCH 02/13] Update Array examples to use `arrow.array` construction function. --- matlab/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/matlab/README.md b/matlab/README.md index 54878a26ef6..970646434ed 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -139,7 +139,7 @@ matlabArray = 1 2 3 ->> arrowArray = arrow.array.Float64Array(matlabArray) +>> arrowArray = arrow.array(matlabArray) arrowArray = @@ -153,7 +153,7 @@ arrowArray = #### Create a MATLAB `logical` array from an Arrow `BooleanArray` ```matlab ->> arrowArray = arrow.array.BooleanArray([true, false, true]) +>> arrowArray = arrow.array([true, false, true]) arrowArray = @@ -195,7 +195,7 @@ validElements = 1 0 1 0 1 % Specify which values are Null/Valid by supplying a logical validity "mask" ->> arrowArray = arrow.array.Int8Array(matlabArray, Valid=validElements) +>> arrowArray = arrow.array(matlabArray, Valid=validElements) arrowArray = From 7703e5162983e01b9e287461e292c8368a60c325 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 15:50:10 -0400 Subject: [PATCH 03/13] Add examples of `RecordBatch` usage. 
--- matlab/README.md | 163 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 163 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index 970646434ed..28e781748a2 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -208,6 +208,169 @@ arrowArray = ] ``` +### Arrow `RecordBatch` class + +#### Create an Arrow `RecordBatch` from a MATLAB `table` + +```matlab +>> matlabTable = table(["A"; "B"; "C"], [1; 2; 3], [true; false; true]) + +matlabTable = + + 3x3 table + + Var1 Var2 Var3 + ____ ____ _____ + + "A" 1 true + "B" 2 false + "C" 3 true + +>> arrowRecordBatch = arrow.recordbatch(matlabTable) + +arrowRecordBatch = + +Var1: [ + "A", + "B", + "C" + ] +Var2: [ + 1, + 2, + 3 + ] +Var3: [ + true, + false, + true + ] +``` + +#### Create a MATLAB `table` from an Arrow `RecordBatch` + +```matlab +>> arrowRecordBatch + +arrowRecordBatch = + +Var1: [ + "A", + "B", + "C" + ] +Var2: [ + 1, + 2, + 3 + ] +Var3: [ + true, + false, + true + ] + +>> matlabTable = table(arrowRecordBatch) + +matlabTable = + + 3x3 table + + Var1 Var2 Var3 + ____ ____ _____ + + "A" 1 true + "B" 2 false + "C" 3 true +``` + +#### Create an Arrow `RecordBatch` from multiple Arrow `Array`s + + +```matlab +>> stringArray = arrow.array(["A", "B", "C"]) + +stringArray = + +[ + "A", + "B", + "C" +] +>> timestampArray = arrow.array([datetime(1997, 01, 01), datetime(1998, 01, 01), datetime(1999, 01, 01)]) + +timestampArray = + +[ + 1997-01-01 00:00:00.000000, + 1998-01-01 00:00:00.000000, + 1999-01-01 00:00:00.000000 +] +>> booleanArray = arrow.array([true, false, true]) + +booleanArray = + +[ + true, + false, + true +] + +>> arrowRecordBatch = arrow.tabular.RecordBatch.fromArrays(stringArray, timestampArray, booleanArray) + +arrowRecordBatch = + +Column1: [ + "A", + "B", + "C" + ] +Column2: [ + 1997-01-01 00:00:00.000000, + 1998-01-01 00:00:00.000000, + 1999-01-01 00:00:00.000000 + ] +Column3: [ + true, + false, + true + ] +``` + +#### Extract a column from a `RecordBatch` by index + 
+```matlab +>> arrowRecordBatch = arrow.tabular.RecordBatch.fromArrays(stringArray, timestampArray, booleanArray) + +arrowRecordBatch = + +Column1: [ + "A", + "B", + "C" + ] +Column2: [ + 1997-01-01 00:00:00.000000, + 1998-01-01 00:00:00.000000, + 1999-01-01 00:00:00.000000 + ] +Column3: [ + true, + false, + true + ] + +>> timestampArray = arrowRecordBatch.column(2) + +timestampArray = + +[ + 1997-01-01 00:00:00.000000, + 1998-01-01 00:00:00.000000, + 1999-01-01 00:00:00.000000 +] +``` + ### Feather V1 #### Write a MATLAB table to a Feather v1 file From 7f88ed435e9bdfc5e05636a865de5c5aee6efd06 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 15:51:58 -0400 Subject: [PATCH 04/13] Add newlines between arrays. --- matlab/README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index 28e781748a2..b21651afc2b 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -297,6 +297,7 @@ stringArray = "B", "C" ] + >> timestampArray = arrow.array([datetime(1997, 01, 01), datetime(1998, 01, 01), datetime(1999, 01, 01)]) timestampArray = @@ -306,6 +307,7 @@ timestampArray = 1998-01-01 00:00:00.000000, 1999-01-01 00:00:00.000000 ] + >> booleanArray = arrow.array([true, false, true]) booleanArray = From 812fec2c1a55c64cbd5e41b7022eda9c02b3270c Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:07:44 -0400 Subject: [PATCH 05/13] Add usage examples for `Type` classes. --- matlab/README.md | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index b21651afc2b..dd8f11c1d9b 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -373,6 +373,62 @@ timestampArray = ] ``` +### Arrow `Type` classes (i.e. 
`arrow.type.`) + +#### Create an Arrow `Int8Type` object + +```matlab +>> type = arrow.int8() + +type = + + Int8Type with properties: + + ID: Int8 +``` + +#### Create an Arrow `TimestampType` object with a specific `TimeUnit` and `TimeZone` + +```matlab +>> type = arrow.timestamp(TimeUnit="Second", TimeZone="Asia/Kolkata") + +type = + + TimestampType with properties: + + ID: Timestamp + TimeUnit: Second + TimeZone: "Asia/Kolkata" +``` + + +#### Get the type enumeration `ID` for an Arrow `Type` object +```matlab +>> type.ID + +ans = + + ID enumeration + + Timestamp + +>> type = arrow.string() + +type = + + StringType with properties: + + ID: String + +>> type.ID + +ans = + + ID enumeration + + String +``` + ### Feather V1 #### Write a MATLAB table to a Feather v1 file From debe2ecd67ff2acf7a786b2535c88a5f860babd0 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:14:24 -0400 Subject: [PATCH 06/13] Add usage examples for `arrow.type.Field`. --- matlab/README.md | 51 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index dd8f11c1d9b..51bb96831fc 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -429,6 +429,57 @@ ans = String ``` +### Arrow `Field` class + +#### Create an Arrow `Field` with type `Int8Type` + +```matlab +>> field = arrow.field("Number", arrow.int8()) + +field = + +Number: int8 + +>> field.Name + +ans = + + "Number" + +>> field.Type + +ans = + + Int8Type with properties: + + ID: Int8 + +``` + +#### Create an Arrow `Field` with type `StringType` + +```matlab +>> field = arrow.field("Letter", arrow.string()) + +field = + +Letter: string + +>> field.Name + +ans = + + "Letter" + +>> field.Type + +ans = + + StringType with properties: + + ID: String +``` + ### Feather V1 #### Write a MATLAB table to a Feather v1 file From 8189871206578828e5599a85a461af4b2ecc6983 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:20:49 -0400 Subject: 
[PATCH 07/13] Add usage examples for `arrow.tabular.Schema`. --- matlab/README.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index 51bb96831fc..2048a4d7eaf 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -480,6 +480,70 @@ ans = ID: String ``` +### Arrow `Schema` class + +#### Create an Arrow `Schema` from multiple Arrow `Field`s + +```matlab +>> letter = arrow.field("Letter", arrow.string()) + +letter = + +Letter: string + +>> number = arrow.field("Number", arrow.int8()) + +number = + +Number: int8 + +>> schema = arrow.schema([letter, number]) + +schema = + +Letter: string +Number: int8 +``` + +#### Get the `Schema` of an Arrow `RecordBatch` + +```matlab +>> matlabTable = table(["A"; "B"; "C"], [1; 2; 3], VariableNames=["Letter", "Number"]) + +matlabTable = + + 3x2 table + + Letter Number + ______ ______ + + "A" 1 + "B" 2 + "C" 3 + +>> arrowRecordBatch = arrow.recordbatch(matlabTable) + +arrowRecordBatch = + +Letter: [ + "A", + "B", + "C" + ] +Number: [ + 1, + 2, + 3 + ] + +>> arrowSchema = arrowRecordBatch.Schema + +arrowSchema = + +Letter: string +Number: double +``` + ### Feather V1 #### Write a MATLAB table to a Feather v1 file From 9606332a515623c0f859cf7bdf8d2e8d66302a8c Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:25:06 -0400 Subject: [PATCH 08/13] Add usage examples for extracting `Field`s from `Schema`s by index and by name. --- matlab/README.md | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/matlab/README.md b/matlab/README.md index 2048a4d7eaf..6cf2a5a1d6b 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -480,6 +480,42 @@ ans = ID: String ``` +#### Extract an Arrow `Field` from an Arrow `Schema` by index + +```matlab +>> arrowSchema + +arrowSchema = + +Letter: string +Number: double + +% Specify the field to extract by its index (i.e. 
2) +>> field = arrowSchema.field(2) + +field = + +Number: double +``` + +#### Extract an Arrow `Field` from an Arrow `Schema` by name + +```matlab +>> arrowSchema + +arrowSchema = + +Letter: string +Number: double + +% Specify the field to extract by its name (i.e. "Letter") +>> field = arrowSchema.field("Letter") + +field = + +Letter: string +``` + ### Arrow `Schema` class #### Create an Arrow `Schema` from multiple Arrow `Field`s From 949f7256b67bc260c11d2452f7188de14032584a Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:35:07 -0400 Subject: [PATCH 09/13] Update Feather v1 reading and writing examples. --- matlab/README.md | 35 ++++++++++++++++++++++++++++++----- 1 file changed, 30 insertions(+), 5 deletions(-) diff --git a/matlab/README.md b/matlab/README.md index 6cf2a5a1d6b..fa36f7e3eac 100644 --- a/matlab/README.md +++ b/matlab/README.md @@ -585,15 +585,40 @@ Number: double #### Write a MATLAB table to a Feather v1 file ``` matlab ->> t = array2table(rand(10, 10)); ->> filename = 'table.feather'; ->> featherwrite(filename,t); +>> t = table(["A"; "B"; "C"], [1; 2; 3], [true; false; true]) + +t = + + 3×3 table + + Var1 Var2 Var3 + ____ ____ _____ + + "A" 1 true + "B" 2 false + "C" 3 true + +>> filename = "table.feather"; + +>> featherwrite(filename, t) ``` #### Read a Feather v1 file into a MATLAB table ``` matlab ->> filename = 'table.feather'; ->> t = featherread(filename); +>> filename = "table.feather"; + +>> t = featherread(filename) + +t = + + 3×3 table + + Var1 Var2 Var3 + ____ ____ _____ + + "A" 1 true + "B" 2 false + "C" 3 true ``` From 6186320e4b8e5146e7f498358d5f0b21ddd84ca6 Mon Sep 17 00:00:00 2001 From: Kevin Gurney Date: Wed, 16 Aug 2023 16:39:29 -0400 Subject: [PATCH 10/13] Modified wording of type conversion support in Status section. 
---
 matlab/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/matlab/README.md b/matlab/README.md
index fa36f7e3eac..c7988a89f3c 100644
--- a/matlab/README.md
+++ b/matlab/README.md
@@ -27,8 +27,8 @@ This is a very early stage MATLAB interface to the Apache Arrow C++ libraries.
 
 Currently, the MATLAB interface supports:
 
-1. Creating a subset of Arrow `Array` types from MATLAB data (see table below)
-2. Converting MATLAB `table`s to `arrow.tabular.RecordBatch`s
+1. Converting between a subset of Arrow `Array` types and MATLAB array types (see table below)
+2. Converting between MATLAB `table`s and `arrow.tabular.RecordBatch`s
 3. Reading and writing Feather v1 files
 
 Supported `arrow.array.Array` types are included in the table below.

From 07f748b6838a7afb6d069bc315609b6cee1ec2f8 Mon Sep 17 00:00:00 2001
From: Kevin Gurney
Date: Wed, 16 Aug 2023 16:47:34 -0400
Subject: [PATCH 11/13] Add mention of support for creating Arrow `Field`s, `Schema`s, and `Type`s to Status section of `README.md`.

---
 matlab/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/matlab/README.md b/matlab/README.md
index c7988a89f3c..9872f4ff6c6 100644
--- a/matlab/README.md
+++ b/matlab/README.md
@@ -29,7 +29,8 @@ Currently, the MATLAB interface supports:
 
 1. Converting between a subset of Arrow `Array` types and MATLAB array types (see table below)
 2. Converting between MATLAB `table`s and `arrow.tabular.RecordBatch`s
-3. Reading and writing Feather v1 files
+3. Creating Arrow `Field`s, `Schema`s, and `Type`s
+4. Reading and writing Feather v1 files
 
 Supported `arrow.array.Array` types are included in the table below.
 

From 96f6977a742c74912e900b67a3a42777e960021d Mon Sep 17 00:00:00 2001
From: Kevin Gurney
Date: Wed, 16 Aug 2023 17:06:19 -0400
Subject: [PATCH 12/13] Capitalize "V1" in "Feather V1".

---
 matlab/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/matlab/README.md b/matlab/README.md
index 9872f4ff6c6..5131e4976e8 100644
--- a/matlab/README.md
+++ b/matlab/README.md
@@ -30,7 +30,7 @@ Currently, the MATLAB interface supports:
 1. Converting between a subset of Arrow `Array` types and MATLAB array types (see table below)
 2. Converting between MATLAB `table`s and `arrow.tabular.RecordBatch`s
 3. Creating Arrow `Field`s, `Schema`s, and `Type`s
-4. Reading and writing Feather v1 files
+4. Reading and writing Feather V1 files
 
 Supported `arrow.array.Array` types are included in the table below.
 
@@ -583,7 +583,7 @@ Number: double
 
 ### Feather V1
 
-#### Write a MATLAB table to a Feather v1 file
+#### Write a MATLAB table to a Feather V1 file
 
 ``` matlab
 >> t = table(["A"; "B"; "C"], [1; 2; 3], [true; false; true])
@@ -604,7 +604,7 @@ t =
 >> featherwrite(filename, t)
 ```
 
-#### Read a Feather v1 file into a MATLAB table
+#### Read a Feather V1 file into a MATLAB table
 
 ``` matlab
 >> filename = "table.feather";

From d870c77d36554cb84bea03349fab61c1a6af081e Mon Sep 17 00:00:00 2001
From: Kevin Gurney
Date: Thu, 17 Aug 2023 10:17:42 -0400
Subject: [PATCH 13/13] Add newline between section header and code block.

Co-authored-by: Sutou Kouhei
---
 matlab/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/matlab/README.md b/matlab/README.md
index 5131e4976e8..3bd445fe061 100644
--- a/matlab/README.md
+++ b/matlab/README.md
@@ -404,6 +404,7 @@ type =
 
 
 #### Get the type enumeration `ID` for an Arrow `Type` object
+
 ```matlab
 >> type.ID