Description
What needs to happen?
Beam portable schemas include primitive and more complex types (represented as logical types). Some of these types are supported in the Python SDK:
beam/sdks/python/apache_beam/typehints/schemas.py
Lines 23 to 41 in 99202b2
Python              Schema
np.int8     <-----> BYTE
np.int16    <-----> INT16
np.int32    <-----> INT32
np.int64    <-----> INT64
int         ------> INT64
np.float32  <-----> FLOAT
np.float64  <-----> DOUBLE
float       ------> DOUBLE
bool        <-----> BOOLEAN
str         <-----> STRING
bytes       <-----> BYTES
ByteString  ------> BYTES
Timestamp   <-----> LogicalType(urn="beam:logical_type:micros_instant:v1")
Decimal     <-----> LogicalType(urn="beam:logical_type:fixed_decimal:v1")
Mapping     <-----> MapType
Sequence    <-----> ArrayType
NamedTuple  <-----> RowType
beam.Row    ------> RowType
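As a rough illustration of how the arrows in the mapping above can be read, the sketch below transcribes the primitive rows into a plain lookup table. It assumes that `<----->` means the mapping round-trips and `------>` means the Python type is accepted on the way in but is not what comes back. The names `PYTHON_TO_SCHEMA`, `to_schema`, `from_schema`, and `round_trip` are hypothetical helpers for this example, not Beam APIs, and types are written as strings so the snippet runs without NumPy or Beam installed.

```python
# Primitive rows transcribed from the mapping above.
# The boolean is True for "<----->" rows and False for "------>" rows
# (assumption: one-way rows are Python-to-schema only).
PYTHON_TO_SCHEMA = [
    ("np.int8", "BYTE", True),
    ("np.int16", "INT16", True),
    ("np.int32", "INT32", True),
    ("np.int64", "INT64", True),
    ("int", "INT64", False),
    ("np.float32", "FLOAT", True),
    ("np.float64", "DOUBLE", True),
    ("float", "DOUBLE", False),
    ("bool", "BOOLEAN", True),
    ("str", "STRING", True),
    ("bytes", "BYTES", True),
    ("ByteString", "BYTES", False),
]


def to_schema(python_type: str) -> str:
    """Return the schema type name that a Python type name maps to."""
    for py, schema, _ in PYTHON_TO_SCHEMA:
        if py == python_type:
            return schema
    raise KeyError(python_type)


def from_schema(schema_type: str) -> str:
    """Return the Python type name a schema type maps back to.

    Only bidirectional rows are considered, so one-way inputs such as
    "int" are never produced on the way back.
    """
    for py, schema, bidirectional in PYTHON_TO_SCHEMA:
        if schema == schema_type and bidirectional:
            return py
    raise KeyError(schema_type)


def round_trip(python_type: str) -> str:
    """Map a Python type name to a schema type and back again."""
    return from_schema(to_schema(python_type))
```

Under that reading, a field declared as `int` would come back as `np.int64` after a round trip, while `str` comes back unchanged.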
When necessary, Python classes are created to represent a portable type. For example, see Timestamp below:
class Timestamp(object):
The Python SDK is still missing some portable types (e.g. Date, DateTime, Time). We should add support for them to make the cross-language experience smoother.
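To make the request more concrete, here is a minimal sketch of what a Python-side Date type might look like. It is not an existing Beam class. It assumes a date is represented on the wire as an integer count of days since 1970-01-01, which is an assumption for this example and should be checked against the other SDKs. The class name `DateSketch` and the `EPOCH` constant are hypothetical; the method names `to_representation_type` and `to_language_type` are chosen to resemble a logical-type conversion pair and are likewise assumptions.

```python
import datetime

# Assumed reference point for the integer representation.
EPOCH = datetime.date(1970, 1, 1)


class DateSketch:
    """Hypothetical converter between datetime.date and epoch days.

    Illustration only; this is not part of the Beam Python SDK.
    """

    @staticmethod
    def to_representation_type(value: datetime.date) -> int:
        # Days elapsed since the epoch; negative for earlier dates.
        return (value - EPOCH).days

    @staticmethod
    def to_language_type(days: int) -> datetime.date:
        # Inverse of to_representation_type.
        return EPOCH + datetime.timedelta(days=days)
```

The two methods are inverses of each other, so a value converted to its integer form and back yields the original `datetime.date`. DateTime and Time would need their own representation choices (for example, precision and time zone handling), which this sketch does not address.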
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner