Conversation
$error = null;
foreach ($responses as $idx => $response) {
    try {
        /** @var CopyPartResult $copyPartResult */
This is totally useless. The getter already defines its return type properly.
If the static analysis or the IDE does not properly detect the type of $response based on assignments in the array, the acceptable thing would be to add @var array<int, UploadPartCopyOutput> $responses on the initialization of the empty array.
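A minimal sketch of that suggestion (class and method names assumed from the async-aws S3 package, not confirmed by this thread; $partRequests and $s3 are placeholders for the surrounding code):

use AsyncAws\S3\Result\UploadPartCopyOutput;

/** @var array<int, UploadPartCopyOutput> $responses */
$responses = [];

foreach ($partRequests as $partNumber => $request) {
    // Typing the array once at initialization lets static analysis infer the
    // type of $response in later loops, with no per-iteration @var override.
    $responses[$partNumber] = $s3->uploadPartCopy($request);
}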
Here is the same situation; the exception is handled in the same place.
My comment was not about the error handling. It was about the @var comment.
return;
}

/** @var string $uploadId */
This is useless. getUploadId already defines its return type properly.
Hello.
How can I handle these errors (https://github.com/async-aws/aws/actions/runs/6642786472/job/18048337300?pr=1592) without doc blocks?
Well, if the AWS API can indeed return string|null for the upload id, you must actually handle the null case properly instead of pretending that the type inference is wrong and forcing the type checkers to consider the variable as having the type string.
If the AWS API actually returns null, the code would still be broken at runtime, and the type override added there only prevents the type checker from detecting the broken code. It does not make it less broken.
As I understand from the AWS documentation (https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html), if the action is successful, the service sends back an HTTP 200 response, and the upload id will always be a string; otherwise an exception would be thrown.
The underlying client handles the uploadId == null case and throws an exception. I don't see why this code should be broken; an exception will be thrown either way.
The solution was taken from here:
https://github.com/async-aws/aws/blob/master/src/Integration/Aws/SimpleS3/src/SimpleS3Client.php#L104
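For comparison, a minimal sketch of the explicit null-narrowing stof describes, assuming the generated getUploadId() is typed ?string:

$response = $this->createMultipartUpload(['Bucket' => $destBucket, 'Key' => $destKey]);

$uploadId = $response->getUploadId();
if (null === $uploadId) {
    // Defensive: the docs promise an UploadId on HTTP 200, but the generated
    // type is nullable, so narrow it explicitly instead of using @var.
    throw new \RuntimeException('CreateMultipartUpload returned no UploadId.');
}

// From here on, static analysis knows $uploadId is a string.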
foreach ($responses as $idx => $response) {
    try {
        $copyPartResult = $response->getCopyPartResult();
        $parts[] = new CompletedPart(['ETag' => $copyPartResult->getEtag(), 'PartNumber' => $idx]);
The weird thing about the PossiblyNullReference error reported by Psalm here is that https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPartCopy.html#API_UploadPartCopy_ResponseElements documents CopyPartResult as required in the output. So it is weird that the operation was generated with a nullable type. Maybe the AWS SDK has metadata that does not match the documentation there.
It seems there are no "required" fields for outputs:
https://raw.githubusercontent.com/aws/aws-sdk-php/3.283.11/src/data/s3/2006-03-01/api-2.json
I suggest reporting this to AWS as it looks like a mismatch between the human-readable documentation and the machine-readable data used to generate the SDKs.
What should I do?
Add additional checking, or wait for an AWS fix?
We are waiting for this functionality.
Hi @stof
I wanted to bring to your attention an ongoing issue regarding AWS that was posted approximately 4 months ago. It seems that progress on resolving this matter has been slower than anticipated.
Considering the delay in fixing the manifests, I'd like to propose a potential solution. Would it be feasible to remove the annotations related to this issue from the code and handle the error suppression in the Psalm baseline instead? I've come across a few similar issues in the baseline that could potentially benefit from this approach.
I believe this adjustment could provide a temporary workaround while we await a resolution from the AWS side. However, I'm open to discussing alternative solutions or any concerns you may have regarding this proposal.
Given that AWS has not yet clarified what is right between their online documentation and the metadata used to generate all their official SDKs (and then reused to generate our async-aws SDK), I think we should not blindly assume that the online doc is the right one. Instead, we should handle null values properly as required by static analysis (you're also doing it for the upload id above btw).
Note that we don't have any comparison point in official SDKs that are properly typed (I'm excluding aws-sdk-php from that definition) to see how they handle CopyPartResult being absent in the result:
- the TransferManager in the official Java SDK has a copy method, but it is not actually implemented: https://github.com/aws/aws-sdk-java-v2/blob/master/services-custom/s3-transfer-manager/src/main/java/software/amazon/awssdk/transfer/s3/S3TransferManager.java#L685-L687
- equivalent features in other SDKs (the @aws-sdk/lib-storage package of the JS SDK or the TransferManager classes of some other languages) don't have a copy function at all.
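The same explicit narrowing could be applied to CopyPartResult; here is a sketch mirroring the upload-id handling above (the CompletedPart namespace is assumed from the generated SDK):

use AsyncAws\S3\ValueObject\CompletedPart;

foreach ($responses as $idx => $response) {
    $copyPartResult = $response->getCopyPartResult();
    if (null === $copyPartResult) {
        // The generated type is nullable even though the docs mark the field
        // as required, so fail loudly instead of overriding the type checker.
        throw new \RuntimeException(sprintf('UploadPartCopy for part %d returned no CopyPartResult.', $idx));
    }

    $parts[] = new CompletedPart(['ETag' => $copyPartResult->getEtag(), 'PartNumber' => $idx]);
}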
Nice feature to have 👍

ping
* CacheControl?: string,
* Metadata?: array<string, string>,
* PartSize?: positive-int,
* Concurrency?: positive-int,
Please use int; this syntax is not standard and is therefore not understood by all IDEs.
$sourceHead = $this->headObject(['Bucket' => $srcBucket, 'Key' => $srcKey]);
$contentLength = (int) $sourceHead->getContentLength();
$options['ContentType'] = $sourceHead->getContentType();
$concurrency = (int) ($options['Concurrency'] ?? 10);
No need to cast, as it's an int by contract. Let people deal with PHP runtime errors if they misuse the method.
$s3->upload('my-image-bucket', 'photos/cat_2.txt', 'I like this cat');

// Copy objects between buckets
$s3->copy('source-bucket', 'source-key', 'destination-bucket', 'destination-key');
The options parameter deserves a small piece of documentation IMHO.
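For illustration, the documentation being asked for might look like this; the option names come from the diff, while the descriptions and example values are assumptions:

// Copy objects between buckets, with optional tuning:
$s3->copy('source-bucket', 'source-key', 'destination-bucket', 'destination-key', [
    'ContentType' => 'text/plain',   // content type applied to the destination object
    'PartSize' => 128,               // multipart part size, in megabytes
    'Concurrency' => 10,             // number of parallel UploadPartCopy requests
    'mupThreshold' => 2 * 1024 ** 3, // bytes; below this a single atomic CopyObject is used
]);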
* Metadata?: array<string, string>,
* PartSize?: positive-int,
* Concurrency?: positive-int,
* mupThreshold?: positive-int,
For consistency:

- * mupThreshold?: positive-int,
+ * MupThreshold?: positive-int,
unset($options['PartSize']);

// If file is less than multipart upload threshold, use normal atomic copy
if ($contentLength < $mupThreshold) {
Do we really need a parameter for this?
Is it an issue to split the file even if it is smaller than the default 2GB?
* files smaller than 64 * 10 000 = 640GB. If you are copying larger files,
* please set PartSize to a higher number, like 128, 256 or 512. (Max 4096).
*/
$partSize = ($options['PartSize'] ?? 64) * $megabyte;
If PartSize is not defined, I'd rather default to max(64 * $megabyte, 2 ** ceil(log($contentLength / 10000, 2)));
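A sketch of that default, keeping the units used in the diff (PartSize in megabytes, $contentLength in bytes); the power-of-two rounding keeps any object under S3's 10,000-part limit:

$megabyte = 1024 * 1024;

$partSize = isset($options['PartSize'])
    ? $options['PartSize'] * $megabyte
    : max(64 * $megabyte, (int) (2 ** ceil(log($contentLength / 10000, 2))));

// Example: a 1.5TB object split into 10 000 parts needs ~158MB per part,
// which rounds up to the next power of two: 256MB.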
} catch (\Throwable $e) {
    $error = $e;

    break;
}
You could move the try/catch outside the loop for more readability, removing the error variable:
try {
    foreach ($responses as $idx => $response) {
        $parts[] = new CompletedPart(['ETag' => $response->getCopyPartResult()->getEtag(), 'PartNumber' => $idx]);
    }
} catch (\Throwable $e) {
    foreach ($responses as $response) {
        try {
            $response->cancel();
        } catch (\Throwable $e) {
            continue;
        }
    }
    $this->abortMultipartUpload(AbortMultipartUploadRequest::create(['Bucket' => $destBucket, 'Key' => $destKey, 'UploadId' => $uploadId]));
}
    --$parallelChunks;
}
$error = null;
foreach ($responses as $idx => $response) {
This is not efficient. We define 10 concurrent uploads, but in fact we upload 10, wait for all 10 to finish, then upload 10 more, and so on.
That means:
- we wait for the 10th to finish before starting any new ones, even if the other 9 finished earlier;
- if 1 upload takes a long time to finish, we don't leverage the 9 other slots.
A better implementation would be to use a pool (or buffer): fill the pool with 10 responses, and once 1 response is over, add a new one before checking the other responses.
Better: leverage async processing by not waiting for the end of a response, but checking whether the response is over; if not, check the next one.
while not all parts uploaded
    while pool not full
        => start new upload
    while pool full
        foreach response in pool
            if response not finished
                continue
            else
                => process response
                => remove response from pool
foreach response in pool
    => wait for response to finish
    => process response
    => remove response from pool
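A PHP sketch of that pool. startUpload() and processResponse() are hypothetical helpers standing in for the UploadPartCopy call and the CompletedPart bookkeeping, and the pending result is assumed to offer a non-blocking resolve(0.0): bool check; verify against the actual async-aws Result API before relying on this:

$pool = [];
$partNumber = 1;

while ($partNumber <= $totalParts || [] !== $pool) {
    // Refill the pool up to the concurrency limit.
    while (\count($pool) < $concurrency && $partNumber <= $totalParts) {
        $pool[$partNumber] = startUpload($partNumber);
        ++$partNumber;
    }

    // Sweep the pool without blocking on any single response: harvest the
    // finished ones and leave the rest in flight for the next pass.
    foreach ($pool as $idx => $response) {
        if (!$response->resolve(0.0)) {
            continue; // still in flight, check the next one
        }

        processResponse($idx, $response);
        unset($pool[$idx]);
    }
}

A production version would also wait briefly when a full sweep finds nothing finished, instead of spinning.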
unset($options['PartSize']);

// If file is less than multipart upload threshold, use normal atomic copy
if ($contentLength < $mupThreshold) {
This threshold should not be configurable IMO; upload does not let the caller choose the threshold for switching to a simple upload instead of a multipart one.
SimpleS3Client::copy method added.
The method resolves by itself which API it should use (copyObject for an atomic copy, or a copy by parts).
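A minimal sketch of that dispatch, using the names from the diff and an illustrative (not confirmed) 2GB threshold:

public function copy(string $srcBucket, string $srcKey, string $destBucket, string $destKey, array $options = []): void
{
    $sourceHead = $this->headObject(['Bucket' => $srcBucket, 'Key' => $srcKey]);
    $contentLength = (int) $sourceHead->getContentLength();

    // Small objects: a single atomic CopyObject call.
    if ($contentLength < 2 * 1024 ** 3) {
        $this->copyObject([
            'Bucket' => $destBucket,
            'Key' => $destKey,
            'CopySource' => sprintf('%s/%s', $srcBucket, $srcKey),
        ]);

        return;
    }

    // Large objects: CreateMultipartUpload, then parallel UploadPartCopy
    // calls, then CompleteMultipartUpload (the multipart path discussed above).
}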