AWS Lambda (python) for Batch processing of wav files from S3 to S3 #20
Replies: 4 comments
Some additional content... What seems to be possibly relevant is:
Some more information (for all): I tried to hack this by appending a #fragment, which should cause everything at the end of the URL to be ignored by the web server (fragment interpretation is done by the client, not the server). I think Speechmatics' advice is to 'green-light' (allowlist) the Speechmatics source server IPs so that they can write to S3.
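To make the fragment idea concrete, here is a minimal sketch using only the standard library. The URL, the job id, and the appended parameters are hypothetical. It also assumes the calling service appends its own parameters to the raw URL string; if the service parses the URL and rebuilds the query instead, the trick would not help.

```python
from urllib.parse import urlsplit


def server_visible_part(url: str) -> str:
    """Return the path and query that an HTTP client would send to the server."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


# Hypothetical pre-signed URL with a trailing '#' added before handing it over.
presigned = "https://example-bucket.s3.amazonaws.com/out/a.json?X-Amz-Signature=abc123#"

# Assumption: the caller naively appends its own parameters to the string,
# so they land in the fragment rather than in the signed query string.
callback = presigned + "&id=job42&status=success"

print(server_visible_part(callback))   # path plus the original query only
print(urlsplit(callback).fragment)     # the appended parameters
```

The fragment is never part of the request line, so the signed query string would reach S3 unchanged under that assumption.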
Also, does anyone monitor these topics? I'm not seeing much activity here.
OK so it seems confirmed that:
Hello all, I'm trying to create an AWS solution to bulk process 10,000 files each day.
Initially I'm developing a single S3 object processor in AWS Lambda ( python ).
( I think later the appropriate choreography architecture might involve AWS Step Functions Distributed Map but I'm not there yet )
I've had some struggles & some successes.
I can't seem to use the speechmatics-python library in AWS Lambda, because when I package it as a layer it is too big (300 MB > 250 MB limit).
As Requests is no longer part of the base AWS Lambda runtime, I attempted to use urllib.request & http.client, but I couldn't initially get multipart/form-data working. (Speechmatics has since sent me a draft way to do this.)
I did finally manage to package Requests as a layer & that is now working ( hooray ).
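For anyone who would rather avoid the Requests layer, a multipart/form-data body can be built with the standard library alone. The following is a minimal sketch, not the draft Speechmatics sent. The helper names `encode_multipart` and `build_job_request` are hypothetical. The form field names `config` and `data_file` and the Bearer-token header are assumptions about the Speechmatics batch jobs endpoint and should be checked against its documentation.

```python
import json
import uuid
import urllib.request


def encode_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict of name -> str
    files:  dict of name -> (filename, bytes, content_type)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")
        lines.append(value.encode("utf-8"))
    for name, (filename, data, ctype) in files.items():
        lines.append(f"--{boundary}".encode())
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode()
        )
        lines.append(f"Content-Type: {ctype}".encode())
        lines.append(b"")
        lines.append(data)
    lines.append(f"--{boundary}--".encode())  # closing boundary
    lines.append(b"")
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_job_request(api_url, api_key, config, audio=None):
    """Prepare (but do not send) a POST request carrying the job config.

    audio is an optional (filename, bytes, content_type) tuple; it can be
    omitted when the config tells the service to fetch the audio itself.
    """
    fields = {"config": json.dumps(config)}       # assumed field name
    files = {"data_file": audio} if audio else {}  # assumed field name
    body, content_type = encode_multipart(fields, files)
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
```

The returned request can be sent with `urllib.request.urlopen(req)`; building and sending are kept separate so the encoding can be checked without network access.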
I'm now currently getting the error:
"message": "Error in sending notification: unable to send notification: Response status: 403, retrying",
I've generated pre-signed URLs for both s3.get_object & s3.put_object.
The transcript shows up successfully in the Speechmatics Portal when I direct the message to that server, so I think that the read (GET) from S3 is working.
But the resulting file never shows up in my S3 bucket, whether the job runs on the Speechmatics server or on my own Virtual Appliance.
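As a starting point for the base case, the sketch below pairs a pre-signed GET URL for the input with a pre-signed PUT URL for the transcript and puts both into a job config. One common cause of a 403 on a pre-signed PUT URL is that the request arrives with a different HTTP method or a modified query string than the one that was signed. The `"method": "put"` entry in `notification_config` is an assumption about the Speechmatics API to verify against its documentation. The helper names and the `transcripts/` output prefix are hypothetical.

```python
from urllib.parse import unquote_plus


def presign_pair(s3_client, bucket, in_key, out_key, expires=3600):
    """Return (get_url, put_url) signed for reading in_key and writing out_key."""
    get_url = s3_client.generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": in_key}, ExpiresIn=expires
    )
    put_url = s3_client.generate_presigned_url(
        "put_object", Params={"Bucket": bucket, "Key": out_key}, ExpiresIn=expires
    )
    return get_url, put_url


def build_job_config(get_url, put_url, language="en"):
    """Job config that fetches audio from get_url and delivers to put_url."""
    return {
        "type": "transcription",
        "transcription_config": {"language": language},
        "fetch_data": {"url": get_url},
        "notification_config": [
            # Assumption: "method" selects PUT to match the signed URL.
            {"url": put_url, "method": "put", "contents": ["transcript"]}
        ],
    }


def lambda_handler(event, context):
    """Handle one S3 'object created' event and return the job config."""
    import boto3  # bundled with the Lambda Python runtime

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # S3 event keys are URL-encoded
    out_key = f"transcripts/{key}.json"          # hypothetical output layout
    get_url, put_url = presign_pair(boto3.client("s3"), bucket, key, out_key)
    return build_job_config(get_url, put_url)
```

Submitting the config to the jobs endpoint is left out here. The Lambda execution role needs `s3:GetObject` and `s3:PutObject` on the relevant keys, since a pre-signed URL carries the permissions of whoever signed it.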
It would be really great if there was a reference AWS Lambda that did the base case.
I'd be happy to work on this with anyone.
Thanks in advance - Ted