Skip to content

feat(bindings/c): framework of add basic io and init logics#1861

Merged
Xuanwo merged 4 commits intoapache:mainfrom
xyjixyjixyji:c-basicio
Apr 7, 2023
Merged

feat(bindings/c): framework of add basic io and init logics#1861
Xuanwo merged 4 commits intoapache:mainfrom
xyjixyjixyji:c-basicio

Conversation

@xyjixyjixyji
Copy link
Copy Markdown
Contributor

@xyjixyjixyji xyjixyjixyji commented Apr 5, 2023

  • Add the init logics for operator
  • Add the basic io operations (r/w)
  • Add the test for basicio

Fixes: #1201

I want to raise this PR for further discussions. And I think(maybe?) a user guide and developer guide for C bindings is needed for style consistency.

Some explainations:

  1. Why we have an OperatorPtr type.

    • This is because cbindgen does not support arbitrary types. Therefore, I store the Operator's address inside the pointer, and use get_ref to provide a immutable reference to its underlying BlockingOperator.
  2. In C, librarys normally have prefix, I naively uses od as the function prefix and opendal as struct prefix.

  3. Why we have a Vector type.

    • This is related to the same problem. The cbindgen does not support Vec<T> from Rust, therefore I uses a quite brute force way of a C-compatible alternative.

@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

Also, I would appreciate it if anyone can share some ideas on the construction of hashmap used to construct operator, since cbindgen does not support HashMap.

Storing it by pointer is obviously not 100% safe since the hashmap might be reallocated upon insertion and deletion. But IMO that might be acceptable since the map will only be used to construct operator.

I have another idea might be better, which is it could accept a sequence of Key-Value pairs separately in the argument and construct the hashmap in our Rust code.

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 6, 2023

Also, I would appreciate it if anyone can share some ideas on the construction of hashmap used to construct operator, since cbindgen does not support HashMap.

That's a good question. I don't have any ideas yet, but we could focus on implementing memory service support first. Ideally, we should avoid supporting initialization backends through hashmaps in C.

Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
@xyjixyjixyji xyjixyjixyji force-pushed the c-basicio branch 2 times, most recently from f52c6d6 to e2db532 Compare April 6, 2023 06:41
* Add the init logics for operator
* Add the basic io operations (r/w)
* Add the test for basicio

Fixes: apache#1201
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
* The hello_opendal() works as a place holder for c binding,
  remove it since it will not be used anymore

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
* Previously the type opendal_operator_ptr uses a C layout, i.e.,
  it STORES a pointer inside it. This commit changes it to transparent
  layout, which guarantees it has the SAME layout of the inner pointer.
  This allows the opendal_operator_ptr uses native boolean operation to
  check validity. e.g. (!ptr) means invalid.

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
@xyjixyjixyji xyjixyjixyji force-pushed the c-basicio branch 2 times, most recently from 849d253 to d65a3c3 Compare April 6, 2023 14:47
@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

I have added the Result and Error types for error handling. Error handling with strings is tricky in C, so the code is a litlle bit complex, making this a relatively big PR. Thanks everyone in advance for spending time reviewing this.

Some explainations:

  • I did not use generics on Result type even cbindgen supports it because the naming of generics in cbindgen is unacceptable.
  • I allocates the error on heap since it contains error message.
  • I use Rust-like Result type in C since I found it very elegant for error handling.

I would really appreciate any review and suggestions on the PR and feel free to contact me for any changes or questions.

Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/include/opendal.h Outdated
Comment thread bindings/c/src/error.rs Outdated
Comment thread bindings/c/src/error.rs Outdated
Comment thread bindings/c/src/error.rs Outdated
@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

xyjixyjixyji commented Apr 7, 2023

Hi @Xuanwo , I think carefully about the so called error thing and have several things want to discuss. (By the way, I think the opendal_code would be a good practice and I have decided to use that xD)

  1. The API of opendal_operator_new
  • You proposed a two-stage API for building an operator, I have following thoughts on that.
    • IMO the API you proposed means that operator_new returns a operator_ptr stores necessary information to build a operator_ptr with which we use operator_build to further build it and return the error code. I think the operator_ptr should only stores the pointer to the actual operator
      • Therefore, I have some ideas on this problem for this PR.
        1. For this PR, we just merge the operator_new function, which returns a operator_ptr. If anything fails, the operator_ptr will point to NULL.
        2. The two-stage API you propose is no doubt a good practice to propagate error, my idea will be APIs like
struct opendal_operator_builder* opendal_operator_new(char* scheme);
opendal_code opendal_operator_build(opendal_operator_builder* builder);

struct opendal_operator_builder {
    struct opendal_operator_ptr ptr;
    char* scheme;
    ...
}

let the builder stores the actual operator_ptr, which will further be used by user.

  1. The pattern matching of od::ErrorKind

    • The problem is about the ErrorKind definition in core, since it is attributed by #[non_exhaustive]. I suggest remove that attribute so that every update on the od::ErrorKind will let all bindings' compilation fail, working as a reminder for bindings update.
  2. The error message design

    • This is tricky, I think we may leave the error message to further PRs.

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

2. The two-stage API you propose is no doubt a good practice to propagate error, my idea will be APIs like

Good idea. I do think we need some builder things in c bindings too. But we can leave this to further PRs. Python/Node.js/Java has similiar problems too. Maybe we should do some code generation to address this issue.

And for now, I think it's Ok to have a opendal_operator_new(void) (just for memory service).

  • The problem is about the ErrorKind definition in core, since it is attributed by #[non_exhaustive]. I suggest remove that attribute so that every update on the od::ErrorKind will let all bindings' compilation fail, working as a reminder for bindings update.

No, this error kind is non_exhaustive. It's part of our public API.

Bindings should handle all error kinds. And everytime we add a new error, we should update all bindings. It's not a heavy work and we won't add new kinds every day.

  • This is tricky, I think we may leave the error message to further PRs.

Yep, agreed.

@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

No, this error kind is non_exhaustive. It's part of our public API.

In the binding codes, the matching has to have a "_" arm, I would use the General Error Code for this arm. Just let you know that every ErrorKind update in core should also invoke manual update on the bindings (this has to be kept in mind by the developer since the compiler will give no error).

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

In the binding codes, the matching has to have a "_" arm, I would use the General Error Code for this arm.

Just removing this arm (

Sorry for my mistake. Please just leave a panic!() or unreachable!() here. This case should never be reached.

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

This arm can only be removed if the #[non_exhaustive] attribute is removed, I know why we keep talking about this /

Sorry for my mistake. I have updated my previous comment.

@xyjixyjixyji xyjixyjixyji force-pushed the c-basicio branch 2 times, most recently from e9343b8 to c7be052 Compare April 7, 2023 09:37
@xyjixyjixyji xyjixyjixyji requested a review from Xuanwo April 7, 2023 09:37
Comment thread bindings/c/include/opendal.h
Comment thread bindings/c/include/opendal.h Outdated
@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

The s3 cpp sdk:

bool AwsDoc::S3::PutObject(const Aws::String &bucketName,
                           const Aws::String &fileName,
                           const Aws::Client::ClientConfiguration &clientConfig) {
    Aws::S3::S3Client s3_client(clientConfig);

    Aws::S3::Model::PutObjectRequest request;
    request.SetBucket(bucketName);
    //We are using the name of the file as the key for the object in the bucket.
    //However, this is just a string and can be set according to your retrieval needs.
    request.SetKey(fileName);

    std::shared_ptr<Aws::IOStream> inputData =
            Aws::MakeShared<Aws::FStream>("SampleAllocationTag",
                                          fileName.c_str(),
                                          std::ios_base::in | std::ios_base::binary);

    if (!*inputData) {
        std::cerr << "Error unable to read file " << fileName << std::endl;
        return false;
    }

    request.SetBody(inputData);

    Aws::S3::Model::PutObjectOutcome outcome =
            s3_client.PutObject(request);

    if (!outcome.IsSuccess()) {
        std::cerr << "Error: PutObject: " <<
                  outcome.GetError().GetMessage() << std::endl;
    }
    else {
        std::cout << "Added object '" << fileName << "' to bucket '"
                  << bucketName << "'.";
    }

    return outcome.IsSuccess();
}

Seems we do have to return result for every operation... 😞

@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

xyjixyjixyji commented Apr 7, 2023

Seems we do have to return result for every operation... 😞

Yeah, looks like this is the common practice. But fortunately there isn't that much operations and we only need one for each of them...

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

Seems we do have to return result for every operation... disappointed

Yeah, looks like this is the common practice. But fortunately there isn't that much operations and we only need one for each of them...

Sorry for my misleading. Let's do this. This PR is good enough to get merged after #1861 (comment) addressed. We can add them in future PRs.

@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

Last work: please make our checks happy.

@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

Last work: please make our checks happy.

Every commit of mine passes the cargo clippy and cargo fmt on my local machine, so that should not be a problem. Let's wait for the CI check 😆 😆 😆

* Add error code opendal_error for error handling, which decouples
  the od::ErrorKind

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
@Xuanwo
Copy link
Copy Markdown
Member

Xuanwo commented Apr 7, 2023

Every commit of mine passes the cargo clippy and cargo fmt on my local machine, so that should not be a problem. Let's wait for the CI check

Please fix https://github.com/apache/incubator-opendal/actions/runs/4638141339/jobs/8207676488?pr=1861

@xyjixyjixyji
Copy link
Copy Markdown
Contributor Author

Every commit of mine passes the cargo clippy and cargo fmt on my local machine, so that should not be a problem. Let's wait for the CI check

Please fix https://github.com/apache/incubator-opendal/actions/runs/4638141339/jobs/8207676488?pr=1861

Sorry, I forget there should be a newline after the licence. That should be fixed now.

Copy link
Copy Markdown
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Let's keep moving.

@Xuanwo Xuanwo merged commit 920c3d6 into apache:main Apr 7, 2023
@xyjixyjixyji xyjixyjixyji deleted the c-basicio branch April 7, 2023 13:44
@gaby gaby mentioned this pull request May 9, 2023
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OpenDAL C API

2 participants