feat: Add S3, Postgres & Kafka handlers #51
SchemaType = TypeVar("SchemaType", bound=pa.SchemaModel)  # TODO: utilise this

class BaseResource(BaseModel, ABC):
Actually, now that I think about it, it seems that BaseResource is both a configuration class, and it contains IO logic... Is this necessary? Could we not pass in a pydantic config object into the resource class instead of inheriting from BaseModel?
So, this really is the crux of the matter. There are several ways to go about this. My first implementation was along the lines of: create a resource (just a pydantic model, no functions) and then pass it to a read/write function that takes a resource of a specific type. The reason I combined these into one was to keep usage simple. We can maybe achieve this by different means.
@deaglancrew and I had a long discussion about using a creational builder pattern here, separating out configuration, IO, injection, validation and logging into their own classes. We have not yet found a technical solution that gives this a nice API, and we haven't figured out how to connect the dots between injection and reading either. Imo, there might be a tradeoff between usability + readability on one side and testability/SOLID on the other. Let's have a look at what we can do here in our pairing later.
From an OOP perspective, I'd love to have what Declan suggested. Separate all the responsibilities out into their own classes. But I'm not 100% sure what the repercussions are in terms of API usability 🤔 I've seen a lot of open source repos actually implementing a "frontend layer." So you do whatever is necessary from an OOP perspective on the backend, but then an additional "frontend" layer is purely responsible for the API. We could adopt a similar pattern
Actually, perhaps this makes sense. The "frontend" is a separate concern...
A facade pretty much. I'd be happy with this - once I figure out an example of how to do it.
Maybe we could find some inspiration on GitHub.
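To make the facade idea from this thread a bit more concrete, here is a minimal sketch of what separating configuration from IO behind a thin "frontend" class could look like. It is not the PR's implementation: `FileConfig`, `Reader`, `LocalTextReader` and `FileResource` are hypothetical names, and a frozen dataclass stands in for the pydantic config model so the snippet runs with the standard library only.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Protocol


@dataclass(frozen=True)
class FileConfig:
    """Configuration only (stand-in for a pydantic model); no IO logic."""

    path: Path


class Reader(Protocol):
    """Anything that can turn a config into data (hypothetical interface)."""

    def read(self, config: FileConfig) -> str: ...


class LocalTextReader:
    """IO only: knows how to read a file, holds no configuration."""

    def read(self, config: FileConfig) -> str:
        return config.path.read_text()


class FileResource:
    """Facade: the thin 'frontend' that wires config and IO together."""

    def __init__(self, path: str, reader: Optional[Reader] = None) -> None:
        self._config = FileConfig(path=Path(path))
        # The reader is injectable, so tests can swap in a fake.
        self._reader = reader if reader is not None else LocalTextReader()

    def read(self) -> str:
        return self._reader.read(self._config)
```

With this shape the user-facing call stays `FileResource("some/path.csv").read()`, while config and IO can be tested separately, for example by passing a fake `reader`. Whether that keeps the API simple enough is exactly the tradeoff discussed above.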
from pandera import SchemaModel

class IOConfig(Protocol):
Very Pythonic usage of protocols instead of ABC metaclasses!
Yeah, imo we should be using ABC. XD
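For reference, the difference between the two options can be sketched as follows. This is only an illustration: the members of the real `IOConfig` are not shown in this diff, so `get_path` and the classes `IOConfigProtocol`, `IOConfigABC`, `S3Config` and `LocalConfig` are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


@runtime_checkable
class IOConfigProtocol(Protocol):
    """Structural typing: any class with a matching method qualifies."""

    def get_path(self) -> str: ...


class IOConfigABC(ABC):
    """Nominal typing: implementations must inherit explicitly."""

    @abstractmethod
    def get_path(self) -> str: ...


class S3Config:
    """Does not inherit from anything, but matches the protocol's shape."""

    def __init__(self, bucket: str, key: str) -> None:
        self.bucket = bucket
        self.key = key

    def get_path(self) -> str:
        return f"s3://{self.bucket}/{self.key}"


class LocalConfig(IOConfigABC):
    """Inherits from the ABC, so a missing get_path fails at instantiation."""

    def __init__(self, path: str) -> None:
        self.path = path

    def get_path(self) -> str:
        return self.path
```

The protocol keeps implementations decoupled from the base class and is checked mainly by static type checkers, while the ABC enforces the contract at instantiation time: `IOConfigABC()` raises `TypeError`, and so would any subclass that forgets `get_path`.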
This adds more handlers.
Things that are still missing: Batched io, logging/metrics, 2 special validations
Jira ticket
Quick Start
What your code might look like without dynamicio
Using dynamicio
That looks pretty similar. What else can you do?
Validation
Validation is handled by pandera, docs here. Resources expect a `pa_schema` by default, passed either when you create the resource or when you perform IO actions. You can opt out by setting `allow_no_schema` to `True`.
Resource classes
Resources are pydantic models, which means any necessary configuration is validated and can be auto-completed by your IDE when you create a resource. They all inherit from one `BaseResource` god object, which looks like this:
Available resources
Injection
dynamicio supports injection into paths/strings when using `{var1}` or `[[var2]]` syntax. We should only have one of those.
`.inject` returns a modified copy of your resource.
`.inject` should be called `.clone_with_injections`.
`.inject(**os.environ)` if you really want :( This follows the old method.
KeyedResources (formerly 'environments')
KeyedResources exist so you can keep different resources under different keys. You can set the key, which is also an immutable action, much like inject. Example:
Previously we called these 'environments', so for most use cases the key was either 'cloud' or 'local'. This can still be used for testing like it was previously.
However, this would also be great for special dev environments, for example so that there are no dbs/services to spin up in dev. Once testing is delegated to uhura, this would probably be the only use case.
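As a rough illustration of the "immutable action" semantics described for both inject and setting the key, the sketch below uses frozen dataclasses from the standard library instead of pydantic. It is not dynamicio's code: `PathResource`, `KeyedPathResource`, `set_key` and `active` are hypothetical names, and leaving unknown placeholders untouched is an assumption rather than documented behaviour.

```python
import re
from dataclasses import dataclass, replace
from typing import Dict

# Matches either {name} or [[name]], the two syntaxes mentioned above.
_PLACEHOLDER = re.compile(r"\{(\w+)\}|\[\[(\w+)\]\]")


@dataclass(frozen=True)
class PathResource:
    path: str

    def inject(self, **values: str) -> "PathResource":
        """Return a modified copy; the original resource is left unchanged."""

        def substitute(match: "re.Match[str]") -> str:
            name = match.group(1) or match.group(2)
            # Assumption: unknown placeholders stay as they are.
            return str(values[name]) if name in values else match.group(0)

        return replace(self, path=_PLACEHOLDER.sub(substitute, self.path))


@dataclass(frozen=True)
class KeyedPathResource:
    resources: Dict[str, PathResource]
    key: str

    def set_key(self, key: str) -> "KeyedPathResource":
        """Return a copy pointing at another key, much like inject."""
        if key not in self.resources:
            raise KeyError(key)
        return replace(self, key=key)

    @property
    def active(self) -> PathResource:
        return self.resources[self.key]


template = PathResource(path="s3://{bucket}/[[env]]/data.csv")
injected = template.inject(bucket="my-bucket", env="dev")

keyed = KeyedPathResource(
    resources={"cloud": template, "local": PathResource(path="/tmp/data.csv")},
    key="cloud",
)
local = keyed.set_key("local")
```

Here `template` and `keyed` are unchanged after the calls, while `injected` and `local` are new objects, which is what makes it safe to share one resource definition between production code and tests.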
Other:
`.set_key_from_env` as well, but I think that shouldn't exist.