Conversation

@utkarsharma2
Contributor

DynamoDBToS3Operator - Add a feature to export the table to a point in time.

closes: #28830

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Apr 6, 2023
@utkarsharma2 utkarsharma2 marked this pull request as draft April 6, 2023 05:14
@utkarsharma2 utkarsharma2 marked this pull request as ready for review April 12, 2023 04:55
@utkarsharma2
Contributor Author

@eladkal @o-nikolas Please review

@utkarsharma2
Contributor Author

FYI - Test cases are failing because of the following issue - #30613

@utkarsharma2 utkarsharma2 force-pushed the DynamoDBToS3Operator branch from 8fcdc46 to 1e215aa Compare April 17, 2023 03:18
@potiuk
Member

potiuk commented Apr 24, 2023

Needs conflict resolution too.

Contributor

@ferruzzi ferruzzi left a comment

Left a couple of comments and a non-blocking suggestion, but I like how this turned out. The PascalCase suggestion needs to be fixed; the other two are just thoughts, so feel free to revise or ignore them.

I really appreciate your patience with the waiter shenanigans, we both learned something on that!

Co-authored-by: D. Ferruzzi <ferruzzi@amazon.com>
Contributor

@ferruzzi ferruzzi left a comment

Nice 👍

@utkarsharma2 utkarsharma2 requested a review from o-nikolas April 27, 2023 05:01
@utkarsharma2 utkarsharma2 marked this pull request as draft April 28, 2023 11:20
Comment on lines 161 to 170
credentials = self.hook.get_credentials()
waiter = self.hook.get_waiter(
"export_table",
client=boto3.client(
"dynamodb",
region_name=client.meta.region_name,
aws_access_key_id=credentials.access_key,
aws_secret_access_key=credentials.secret_key,
),
)
Contributor

❤️
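For context on the snippet being praised above: the `"export_table"` name passed to `get_waiter` resolves to a custom waiter definition shipped with the Amazon provider, expressed in botocore's waiter-2 format. The following is an illustrative sketch of such a definition (values and acceptors are assumptions, not the provider's verbatim file), polling `DescribeExport` until the export reaches a terminal status:

```python
# Illustrative sketch (not the provider's actual definition): custom waiters
# such as "export_table" follow botocore's waiter-2 format, repeatedly calling
# DescribeExport until ExportDescription.ExportStatus is terminal.
export_table_waiter = {
    "version": 2,
    "waiters": {
        "export_table": {
            "operation": "DescribeExport",
            "delay": 30,        # seconds between polls (illustrative)
            "maxAttempts": 60,  # give up after ~30 minutes (illustrative)
            "acceptors": [
                {
                    "matcher": "path",
                    "argument": "ExportDescription.ExportStatus",
                    "expected": "COMPLETED",
                    "state": "success",
                },
                {
                    "matcher": "path",
                    "argument": "ExportDescription.ExportStatus",
                    "expected": "FAILED",
                    "state": "failure",
                },
            ],
        }
    },
}
```

Building the waiter against a fresh `boto3.client("dynamodb", ...)` constructed from the hook's own credentials, as the reviewed snippet does, keeps the polling client consistent with the credentials used to start the export.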

@ferruzzi
Contributor

If the CI is happy, I'm happy 👍 Thanks for sticking with this one.

@potiuk
Member

potiuk commented Apr 29, 2023

CI is not happy. @utkarsharma2 -> you need to fix those.

@utkarsharma2 utkarsharma2 marked this pull request as ready for review May 8, 2023 09:22
@utkarsharma2
Contributor Author

Hey @potiuk, CI is happy now :)

@potiuk potiuk merged commit fc41661 into apache:main May 8, 2023
@ferruzzi
Contributor

ferruzzi commented May 8, 2023

@utkarsharma2 Did you actually try this live? After the merge, the system test fails because this addition is missing the file_size parameter, and after adding that in it fails with the message:

ERROR    airflow.task:taskinstance.py:1900 Task failed with exception
Traceback (most recent call last):
  File "/opt/airflow/airflow/providers/amazon/aws/transfers/dynamodb_to_s3.py", line 142, in execute
    self._export_table_to_point_in_time()
  File "/opt/airflow/airflow/providers/amazon/aws/transfers/dynamodb_to_s3.py", line 157, in _export_table_to_point_in_time
    ExportFormat=self.export_format,
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the ExportTableToPointInTime operation: One or more parameter values were invalid: tableArn is not a valid ARN

Here you are passing dynamodb_table_name instead of the ARN. I'm not sure how this would have worked, and I probably should have noticed that in the review.
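For readers hitting the same ValidationException: the ExportTableToPointInTime API takes a full table ARN, while the operator's scan-based path takes a bare table name. A minimal, hypothetical helper to build the ARN form (the helper name is illustrative and not the actual fix in the follow-up PR):

```python
def dynamodb_table_arn(region: str, account_id: str, table_name: str) -> str:
    """Hypothetical helper: build the table ARN that
    ExportTableToPointInTime requires from its parts."""
    return f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
```

With boto3 the ARN can also be fetched directly via `client.describe_table(TableName=...)["Table"]["TableArn"]` rather than constructed by hand.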

ferruzzi added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request May 8, 2023
o-nikolas pushed a commit that referenced this pull request May 8, 2023
@utkarsharma2
Contributor Author

utkarsharma2 commented May 9, 2023

@ferruzzi Yes, I did test it locally. I was passing the table ARN in place of the table name, along with file_size, but forgot to include them in the example :/
Below is the code that worked for me.

dynamodb_to_s3_operator = DynamoDBToS3Operator(
    task_id="dynamodb_to_s3",
    dynamodb_table_name="arn:aws:dynamodb:us-east-1:633294268925:table/test",
    s3_bucket_name="tmp9",
    file_size=4000,
    export_time=datetime.now(),
    aws_conn_id="aws_default",
    s3_key_prefix="test",
)

[Screenshot: successful run, 2023-05-09 10:55 AM]

@utkarsharma2
Contributor Author

@ferruzzi @o-nikolas Raised a new PR with the fix. Please review: #31142

@ferruzzi
Contributor

ferruzzi commented May 9, 2023

Cool, thanks! I'll take a look later today when I get a moment 👍

Labels

area:providers provider:amazon AWS/Amazon - related issues

Development

Successfully merging this pull request may close these issues.

Export DynamoDB table to S3 with PITR

6 participants