Commits
57 commits
1353391
Upload areas naming changed, using submission envelope UUID, morphic-…
dipayan1985 Apr 2, 2024
ae49d30
Metadata submission support to morphic-util
dipayan1985 Apr 23, 2024
114bf5e
Additional option to link dataset and study, making dataset no-body
dipayan1985 Apr 24, 2024
ee88e9b
fix list and upload, messages improved
dipayan1985 Apr 25, 2024
a11bd57
call the dev API and not localhost and increment version
dipayan1985 Apr 25, 2024
4ddf532
fixing list when there are files and folders both.
dipayan1985 Apr 26, 2024
7795606
fixing delete
dipayan1985 Apr 30, 2024
9e89a39
fix list
dipayan1985 May 13, 2024
2734d03
tsv and csv submission support
dipayan1985 May 15, 2024
db1eaa8
typo in delete
dipayan1985 May 15, 2024
41a916a
submission support added
dipayan1985 Jun 11, 2024
4a22274
metadata spreadsheet submission support added
dipayan1985 Jun 19, 2024
1c927db
metadata spreadsheet submission support added-added library prep support
dipayan1985 Jun 19, 2024
34657ee
metadata spreadsheet submission support - linkage and output file wri…
dipayan1985 Jun 20, 2024
542b3de
metadata spreadsheet submission support - linkage and sequencing file…
dipayan1985 Jun 20, 2024
47b2734
metadata spreadsheet submission support - sequencing file sheet updat…
dipayan1985 Jun 20, 2024
93f34a2
several improvements including file validation
dipayan1985 Jun 25, 2024
7049ee6
update action
dipayan1985 Jun 25, 2024
812757f
delete action
dipayan1985 Jun 27, 2024
5de275f
code improvements, TODOs
dipayan1985 Jul 2, 2024
ea5c4f4
pre-deployment
dipayan1985 Jul 2, 2024
190f510
pre-deployment warnings fix
dipayan1985 Jul 2, 2024
0bdb753
pre-deployment small bug fixes
dipayan1985 Jul 3, 2024
a3cb828
pre-deployment small bug fixes
dipayan1985 Jul 4, 2024
0a9b070
pre-deployment small bug fixes
dipayan1985 Jul 5, 2024
ac27181
use new cognito
dipayan1985 Jul 5, 2024
950d61a
version increment
dipayan1985 Jul 5, 2024
4a87c8c
defect fixes
dipayan1985 Aug 8, 2024
96efda1
defect fixes
dipayan1985 Aug 9, 2024
82149dd
adding expression alteration support
dipayan1985 Aug 13, 2024
0720939
adding expression alteration support and refactoring
dipayan1985 Aug 13, 2024
753a3ad
adding expression alteration update support and more refactoring
dipayan1985 Aug 14, 2024
2a3946a
file validation errors appended to validation errors list
dipayan1985 Aug 14, 2024
94ecc2c
README.md updated
dipayan1985 Aug 14, 2024
37c1722
update version
dipayan1985 Aug 14, 2024
e7f68b4
align data format with schema and update version
dipayan1985 Aug 19, 2024
7ff442a
validate sequencing files uncommented
dipayan1985 Aug 19, 2024
c6f7d53
incr version
dipayan1985 Aug 19, 2024
92ad56b
incr version to 0.0.21 for test pypi issues
dipayan1985 Aug 19, 2024
2adb8d2
incr version to 1.0.0 for new major version with submission support
dipayan1985 Aug 19, 2024
8001ad1
code clean-up
dipayan1985 Aug 31, 2024
1940cdd
code improvements
dipayan1985 Sep 2, 2024
03976d6
handling md5 checksums and new type of sheet having clonal and undiff…
dipayan1985 Sep 3, 2024
15b5e90
better error handling
dipayan1985 Sep 4, 2024
f15c6f2
better error handling
dipayan1985 Sep 5, 2024
24d946f
correct name for expression_alteration_id while object construction
dipayan1985 Sep 6, 2024
ea8b62f
increment version
dipayan1985 Sep 10, 2024
76ff747
improvements
dipayan1985 Sep 30, 2024
db1aab0
Merge pull request #6 from ebi-ait/feature/submission-envelop-uuid-fo…
dipayan1985 Oct 1, 2024
b4c1f97
check if valid dataset is provided
dipayan1985 Oct 3, 2024
fed8578
upgrade version
dipayan1985 Oct 8, 2024
b84ccf9
adapt as per v7 of spreadsheet
dipayan1985 Oct 18, 2024
18b808a
prod
dipayan1985 Nov 8, 2024
010f7d0
md5 sums computation while listing files
dipayan1985 Dec 2, 2024
fa13fd0
don't delete the dataset object
dipayan1985 Dec 9, 2024
fad84a4
Merge pull request #8 from ebi-ait/feature/36-md5-sums-compute-while-…
dipayan1985 Dec 9, 2024
f789643
prod recording related changes
dipayan1985 Mar 19, 2025
96 changes: 89 additions & 7 deletions README.md
@@ -1,6 +1,6 @@
# morphic-util

CLI tool for uploading data to the Morphic AWS S3 buckets.
CLI tool for submitting analysis data and metadata

# Users

@@ -9,8 +9,8 @@ CLI tool for uploading data to the Morphic AWS S3 buckets.
Users need to have

1. Basic command-line knowledge
2. Python3.x installed on their machine
3. AWS Cognito username and password
2. Python 3.10 installed on their machine
3. AWS Cognito username or email and password

## Install

@@ -35,8 +35,10 @@ optional arguments:
--version, -v show program's version number and exit

command:
{config,create,select,list,upload,download,delete}
{config,submit,submit-file,create,select,list,upload,download,delete}
config configure AWS credentials
submit submit your study, dataset or biomaterials metadata (incomplete, as not all metadata types are supported yet; expected to be completed in August 2024)
submit-file submit your metadata file containing your cell lines, differentiated cell lines, library preparations and sequencing files
create create an upload area (authorised users only)
select select or show the active upload area
list list contents of the area
@@ -79,18 +81,40 @@ positional arguments:
password AWS Cognito password
```

The tool uses the profile name _hca-util_ in local AWS config files.
The tool uses the profile name _morphic-util_ in local AWS config files.

## `submit` command
Submit your study and dataset metadata and create your AWS upload area for uploading data files

```shell script
$ morphic-util submit --type <TYPE> --file <PATH_TO_FILE>

arguments:
  --type  type of metadata being submitted (e.g. study or dataset)
  --file  path to the file containing the metadata
```

## `submit-file` command
Submit the metadata file containing your cell lines, differentiated cell lines, library preparations and sequencing files for a dataset

```shell script
$ morphic-util submit-file --file <PATH_TO_FILE> --action <SUBMISSION_ACTION> --dataset <DATASET_ID>

arguments:
  --file     path to the file containing the metadata
  --action   ADD, MODIFY or DELETE based on the type of submission
  --dataset  identifier of the dataset (the analysis which has generated the data and the metadata)
```

## `create` command

Create an upload area/ project folder **(authorised users only)**

```shell script
$ morphic-util create NAME DPC [-p {u,ud,ux,udx}]
$ morphic-util create NAME [-p {u,ud,ux,udx}]

positional arguments:
NAME name for the new area/ project folder
DPC center name of the submitter

optional arguments:
-p {u,ud,ux,udx} allowed actions (permissions) on new area. u for
@@ -161,6 +185,64 @@ optional arguments:
-a delete all files from the area
-d delete upload area and contents (authorised users only)
```
## Performing a submission
### Authenticate
```shell script
$ morphic-util config username password

positional arguments:
username AWS Cognito username
password AWS Cognito password
```
### Create your study
```shell script
$ morphic-util submit --type study --file <PATH_TO_STUDY_METADATA_FILE>

arguments:
  --type  type of metadata being submitted (here: study)
  --file  path to the file containing the metadata
```
### Create your dataset and link it to your study
```shell script
$ morphic-util submit --type dataset --file <PATH_TO_DATASET_METADATA_FILE> --study <STUDY_ID>

arguments:
  --type   type of metadata being submitted (here: dataset)
  --file   path to the file containing the metadata (optional)
  --study  STUDY_ID obtained in the previous step
```
### `select` your upload area to upload your data files (the upload area name is the same as your DATASET_ID)
Show or select the data file upload area
```shell script
$ morphic-util select AREA

positional arguments:
AREA upload area name (same as DATASET_ID obtained in the last step).
```
### `upload` your data files
Upload files to the selected area for the dataset
```shell script
$ morphic-util upload PATH [PATH ...] [-o]

positional arguments:
PATH valid file or directory

optional arguments:
-o overwrite files with same names
```
### `list` uploaded data files to verify that data file upload has been successful
```shell script
$ morphic-util list
```
### `submit-file` command to submit your dataset metadata containing your biomaterials, processes, protocols and files
```shell script
$ morphic-util submit-file --file <PATH_TO_FILE> --action <SUBMISSION_ACTION> --dataset <DATASET_ID>

arguments:
  --file     path to the file containing the metadata
  --action   ADD, MODIFY or DELETE based on the type of submission
  --dataset  identifier of the dataset (the analysis which has generated the data and the metadata)
```
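The steps above can be condensed into a single dry-run script. This is a sketch only: the file names (`study.json`, `dataset.json`, `metadata.xlsx`) and the `<...>` placeholders are illustrative, not values prescribed by the tool.

```shell
#!/bin/sh
# Dry-run sketch of the end-to-end submission workflow described above.
# Swap the `echo "would run:"` in run() for direct execution to run it for real.
run() { echo "would run: $*"; }

run morphic-util config '<USERNAME>' '<PASSWORD>'
run morphic-util submit --type study --file study.json
run morphic-util submit --type dataset --file dataset.json --study '<STUDY_ID>'
run morphic-util select '<DATASET_ID>'
run morphic-util upload data/ -o
run morphic-util list
run morphic-util submit-file --file metadata.xlsx --action ADD --dataset '<DATASET_ID>'
```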

# Developers

21 changes: 19 additions & 2 deletions ait/commons/util/__main__.py
@@ -79,6 +79,21 @@ def parse_args(args):
parser_config.add_argument('PASSWORD', help='AWS Cognito password', nargs='?')
parser_config.add_argument('--bucket', help='use BUCKET instead of default bucket')

parser_config = cmd_parser.add_parser('submit', help='submit your metadata')
parser_config.add_argument('--type', help='data type you are submitting, e.g. study, dataset')
parser_config.add_argument('--file', help='your metadata')
parser_config.add_argument('--study', help='your study reference')
parser_config.add_argument('--dataset', help='your dataset reference')
parser_config.add_argument('--process', help='your process/analysis reference')

parser_config = cmd_parser.add_parser('submit-file', help='submit your file containing your dataset metadata')
parser_config.add_argument('--file', help='spreadsheet containing your dataset metadata')
parser_config.add_argument('--action', help='action you want to perform (ADD/MODIFY/DELETE)')
parser_config.add_argument('--dataset', help='your dataset reference')

parser_config = cmd_parser.add_parser('view', help='view your dataset')
parser_config.add_argument('--dataset', help='your dataset reference')

parser_create = cmd_parser.add_parser('create', help='create an upload area (authorised users only)')
parser_create.add_argument('NAME', help='name for the new area', type=valid_project_name)
parser_create.add_argument('DPC', help='center name of the submitter', type=valid_project_name)
@@ -98,7 +113,8 @@ def parse_args(args):
# parser_clear.add_argument('-a', action='store_true', help='clear all - selection and known dirs')

parser_list = cmd_parser.add_parser('list', help='list contents of the area')
parser_list.add_argument('-b', action='store_true', help='list all areas in the S3 bucket (authorised users only)')
parser_list.add_argument('-processing', action='store_true', help='access the processed data (authorised users '
'only)')

# parser_upload = cmd_parser.add_parser('upload', help='upload files to the area')
# group_upload = parser_upload.add_mutually_exclusive_group(required=True)
@@ -128,7 +144,8 @@ def parse_args(args):
group_delete.add_argument('-d', action='store_true', help='delete upload area and contents (authorised users only)')

parser_sync = cmd_parser.add_parser('sync',
help='copy data from selected upload area to ingest upload area (authorised users only)')
help='copy data from selected upload area to ingest upload area (authorised '
'users only)')
parser_sync.add_argument('INGEST_UPLOAD_AREA', help='Ingest upload area', type=valid_ingest_upload_area)

ps = [parser]
63 changes: 39 additions & 24 deletions ait/commons/util/aws_client.py
@@ -4,7 +4,7 @@

from ait.commons.util.aws_cognito_authenticator import AwsCognitoAuthenticator
from ait.commons.util.settings import AWS_SECRET_NAME_AK_BUCKET, AWS_SECRET_NAME_SK_BUCKET, \
AWS_SECRET_NAME_MORPHIC_BUCKET, COGNITO_MORPHIC_UTIL_ADMIN, S3_REGION
COGNITO_MORPHIC_UTIL_ADMIN, S3_REGION


def static_bucket_name():
@@ -14,7 +14,7 @@ def static_bucket_name():
class Aws:

def __init__(self, user_profile):
self.is_user = False # not admin
self.is_user = True # not admin
self.user_dir_list = None
self.center_name = None
self.secret_key = None
@@ -42,21 +42,23 @@ def get_bucket_name(self, secret_mgr_client):
"""
# access policy can't be attached to a secret
# GetSecretValue action should be allowed for user
resp = secret_mgr_client.get_secret_value(SecretId=AWS_SECRET_NAME_MORPHIC_BUCKET)
resp = secret_mgr_client.get_secret_value(SecretId='')
secret_str = resp['SecretString']
self.bucket_name = json.loads(secret_str)['s3-bucket']
return self.bucket_name

def new_session(self):
aws_cognito_authenticator = AwsCognitoAuthenticator(self)
secret_manager_client = aws_cognito_authenticator.get_secret_manager_client(self.user_profile.username,
self.user_profile.password)
secret_manager_client = aws_cognito_authenticator.secret_manager_client_instance(self.user_profile.username,
self.user_profile.password)

if secret_manager_client is None:
print('Failure while re-establishing Amazon Web Services session, report this error to the DRACC admin')
print(
'Failure while re-establishing Amazon Web Services session, report this error to the MorPhiC DRACC '
'admin')
raise Exception
else:
self.is_user = aws_cognito_authenticator.is_valid_user()
self.is_user = aws_cognito_authenticator.is_user
self.user_dir_list = aws_cognito_authenticator.get_user_dir_list()
self.center_name = aws_cognito_authenticator.get_center_name()

@@ -85,22 +87,35 @@ def is_valid_credentials(self):
def is_valid_user(self):
return self.is_user

def obj_exists(self, key):
def s3_bucket_exists(self, key):
"""
return true if key exists, else false
A folder/directory is an s3 object with key <uuid>/
Note: s3://my-bucket/folder != s3://my-bucket/folder/
Refer to https://www.peterbe.com/plog/fastest-way-to-find-out-if-a-file-exists-in-s3
for comparison between client.list_objects_v2 and client.head_object to make this check.
Also check https://stackoverflow.com/questions/33842944/check-if-a-key-exists-in-a-bucket-in-s3-using-boto3
which suggests using Object.load() - which does a HEAD request, however, user doesn't have
s3:GetObject permission by default, so this will fail for them.
Returns True if the bucket exists, else False.
"""
response = self.new_session().client('s3').list_objects_v2(
Bucket=self.bucket_name,
Prefix=key,
)
for obj in response.get('Contents', []):
if obj['Key'] == key:
return True
return False
client = self.common_session.client('s3')
try:
client.head_bucket(
Bucket=key
)
return True
except client.exceptions.NoSuchBucket as e:
print(f"The bucket '{key}' does not exist. Reason: {e}")
return False

def data_file_exists(self, bucket_name, key):
"""
Check if an object exists in the specified S3 bucket.

Parameters:
- bucket_name (str): The name of the S3 bucket.
- key (str): The key of the object in the bucket.

Returns:
- bool: True if the object exists, False otherwise.
"""
client = self.common_session.client('s3')

try:
client.head_object(Bucket=bucket_name, Key=key)
return True
except client.exceptions.ClientError:
return False
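The HEAD-request pattern in `data_file_exists` above can be sketched without AWS access. `StubS3Client` below stands in for a real boto3 S3 client (it and the sample bucket/keys are illustrative; only the try/except-around-`head_object` logic mirrors the diff).

```python
class _NotFound(Exception):
    """Stands in for the ClientError boto3 raises on a 404 HEAD response."""

class StubS3Client:
    """Minimal stand-in for a boto3 S3 client, for illustration only."""
    def __init__(self, objects):
        self._objects = objects  # set of (bucket, key) pairs that "exist"

    def head_object(self, Bucket, Key):
        # Real S3 returns metadata on success and raises on a missing key.
        if (Bucket, Key) not in self._objects:
            raise _NotFound(f"404: s3://{Bucket}/{Key}")
        return {"ContentLength": 0}

def data_file_exists(client, bucket_name, key):
    """Existence check via HEAD: no exception means the object exists."""
    try:
        client.head_object(Bucket=bucket_name, Key=key)
        return True
    except Exception:
        return False

client = StubS3Client({("my-bucket", "reads/r1.fastq.gz")})
print(data_file_exists(client, "my-bucket", "reads/r1.fastq.gz"))  # True
print(data_file_exists(client, "my-bucket", "missing.txt"))        # False
```

A HEAD request is preferred over `list_objects_v2` here because it is a single constant-cost call and needs no iteration over a listing.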
40 changes: 5 additions & 35 deletions ait/commons/util/aws_cognito_authenticator.py
@@ -1,7 +1,5 @@
import sys

import boto3

from ait.commons.util.settings import DEFAULT_PROFILE, DEFAULT_REGION, COGNITO_CLIENT_ID, COGNITO_IDENTITY_POOL_ID, \
COGNITO_USER_POOL_ID
from ait.commons.util.user_profile import set_profile
@@ -14,12 +12,11 @@ class AwsCognitoAuthenticator:

def __init__(self, args):
self.args = args
self.is_user = False # not admin
self.is_user = True # not admin
self.user_dir_list = None
self.center_name = None # custom attribute DPC

def validate_cognito_identity(self, profile, username, password):

def is_registered_user(self, profile, username, password):
try:
profile = profile if profile else DEFAULT_PROFILE

@@ -64,7 +61,7 @@ def validate_cognito_identity(self, profile, username, password):

if session_token:
set_profile(profile, DEFAULT_REGION, aws_cred['AccessKeyId'], aws_cred['SecretKey'],
session_token, username, password)
session_token, access_token, username, password)

return True
else:
@@ -74,8 +71,7 @@ def validate_cognito_identity(self, profile, username, password):
except Exception as e:
return False

def get_secret_manager_client(self, username, password):

def secret_manager_client_instance(self, username, password):
try:
if username and password:
client = boto3.client("cognito-idp", region_name=DEFAULT_REGION, aws_access_key_id="NONE",
@@ -90,40 +86,14 @@ def get_secret_manager_client(self, username, password):
# Getting the user details.
access_token = response["AuthenticationResult"]["AccessToken"]
id_token = response["AuthenticationResult"]["IdToken"]

response = client.get_user(AccessToken=access_token)

username = response['Username']
user_attribute_list = response['UserAttributes']

if username.endswith('Admin') or username.endswith('admin'):
self.is_user = False
else:
self.is_user = True

for attr in user_attribute_list:
if attr['Name'] == 'custom:DPC':
self.center_name = attr['Value'].lower()

if attr['Name'] == 'custom:directory_access':
self.user_dir_list = attr['Value'].replace(" ", "").split(',')

if self.user_dir_list is not None:
self.user_dir_list = ['morphic-' + self.center_name + '/' + dataset_dir for dataset_dir in
self.user_dir_list]

if self.is_user:
if self.center_name is None:
print('User does not have an assigned center name and therefore cannot perform any operations '
'with this system')
sys.exit(1)

if self.user_dir_list is None:
if self.is_user:
print('User does not have access to any upload areas or to perform any operations with this'
'system')
sys.exit(1)

identity = boto3.client('cognito-identity', region_name=DEFAULT_REGION)

identity_id = identity.get_id(
@@ -156,7 +126,7 @@ def get_secret_manager_client(self, username, password):
except Exception as e:
return None

def is_valid_user(self):
def is_user(self):
return self.is_user

def get_user_dir_list(self):