Use centralized apply-geolocation-rules#166
Merged
jameshadfield
approved these changes
Aug 7, 2023
victorlin
commented
Aug 8, 2023
Member
Author
Ingest failed locally with the following error:
/bin/bash: ./vendored/apply-geolocation-rules: /usr/bin/env: bad interpreter: Permission denied
Full error details
Snakemake log:
Error in rule transform:
jobid: 1
input: data/sequences.ndjson, data/all-geolocation-rules.tsv, source-data/annotations.tsv
output: data/metadata_raw.tsv, data/sequences.fasta
log: logs/transform.txt (check log file(s) for error details)
shell:
(cat data/sequences.ndjson | ./vendored/transform-field-names --field-map collected=date submitted=date_submitted genbank_accession=accession submitting_organization=institution | augur curate normalize-strings | ./bin/transform-strain-names --strain-regex ^.+$ --backup-fields accession | ./bin/transform-date-fields --date-fields date date_submitted --expected-date-formats %Y %Y-%m %Y-%m-%d %Y-%m-%dT%H:%M:%SZ | ./vendored/transform-genbank-location | ./bin/transform-string-fields --titlecase-fields region country division location --articles and d de del des di do en l la las le los nad of op sur the y --abbreviations USA | ./vendored/transform-authors --authors-field authors --default-value ? --abbr-authors-field abbr_authors | ./vendored/apply-geolocation-rules --geolocation-rules data/all-geolocation-rules.tsv | ./vendored/merge-user-metadata --annotations source-data/annotations.tsv --id-field accession | ./bin/ndjson-to-tsv-and-fasta --metadata-columns accession genbank_accession_rev strain date region country division location host date_submitted sra_accession abbr_authors reverse authors institution --metadata data/metadata_raw.tsv --fasta data/sequences.fasta --id-field accession --sequence-field sequence ) 2>> logs/transform.txt
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job transform since they might be corrupted:
data/metadata_raw.tsv, data/sequences.fasta
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-08-08T152252.303349.snakemake.log
Contents of logs/transform.txt:
/bin/bash: ./vendored/apply-geolocation-rules: /usr/bin/env: bad interpreter: Permission denied
Traceback (most recent call last):
File "/nextstrain/build/./vendored/transform-authors", line 65, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/nextstrain/build/./bin/transform-string-fields", line 83, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/nextstrain/build/./vendored/transform-genbank-location", line 42, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/nextstrain/build/./bin/transform-date-fields", line 153, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/nextstrain/build/./bin/transform-strain-names", line 49, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/nextstrain/augur/augur/__init__.py", line 66, in run
return args.__command__.run(args)
File "/nextstrain/augur/augur/curate/__init__.py", line 192, in run
dump_ndjson(modified_records)
File "/nextstrain/augur/augur/io/json.py", line 64, in dump_ndjson
print(as_json(item))
BrokenPipeError: [Errno 32] Broken pipe
An error occurred (see above) that has not been properly handled by Augur.
To report this, please open a new issue including the original command and the error above:
<https://github.com/nextstrain/augur/issues/new/choose>
Traceback (most recent call last):
File "/nextstrain/build/./vendored/transform-field-names", line 47, in <module>
json.dump(record, stdout, allow_nan=False, indent=None, separators=',:')
File "/usr/local/lib/python3.10/json/__init__.py", line 180, in dump
fp.write(chunk)
BrokenPipeError: [Errno 32] Broken pipe
This is because I didn't add the execute bit on the script in nextstrain/shared#4. Will fix in that repo then update this PR.
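A minimal sketch of how the missing execute bit could be caught before running the pipeline. It is not part of this PR or of nextstrain/ingest; the helper names `missing_execute_bit` and `add_execute_bit` are hypothetical, and the check assumes every vendored script starts with a `#!` line.

```python
import os
import stat
from pathlib import Path


def missing_execute_bit(directory):
    """Return names of files that start with a shebang but lack the owner execute bit."""
    flagged = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # A script invoked as ./vendored/<name> needs both a shebang and the execute bit.
        with path.open("rb") as handle:
            has_shebang = handle.read(2) == b"#!"
        if has_shebang and not path.stat().st_mode & stat.S_IXUSR:
            flagged.append(path.name)
    return flagged


def add_execute_bit(path):
    """Equivalent of `chmod +x` with a permissive umask: set execute for user, group, and other."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Calling `missing_execute_bit("vendored")` from the ingest directory would be expected to list `apply-geolocation-rules` in the failing state described above. Since git records the execute bit as part of the file mode, the fix has to be committed in the upstream repo and pulled in again, rather than applied only to the local copy.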
subrepo:
  subdir:   "ingest/vendored"
  merged:   "5d90818"
upstream:
  origin:   "https://github.com/nextstrain/ingest"
  branch:   "main"
  commit:   "5d90818"
git-subrepo:
  version:  "0.4.6"
  origin:   "https://github.com/ingydotnet/git-subrepo"
  commit:   "110b9eb"
The centralized version of the script is a copy of the existing one with some backwards-compatible changes¹.
¹ nextstrain/shared@3b69a10...0ac9a4f
c2f0184 to
81ab6b1
Description of proposed changes
There have been a few updates to nextstrain/ingest since #164 was merged. I pulled those in with git subrepo, and swapped over to use the one newly added script.
Related issue(s)
Follow-up to #164.
Testing