A bug regarding the summary JSON output #15

@JohnMMa

I noticed that if the input FASTQ set does not include all the tags listed in the config file, the tag_qc object in the summary JSON file (i.e. the one generated by -s) is not terminated properly, which causes issues for downstream operations.

The attached files are the FASTQ inputs, summary JSON, and config file for the following:

splitcode -c tags.txt --x-only -C 1 -N 3 -t 2 --summary /home/data_datastore/Analysis/[...]/new_local/s8_summary.json exp000705_sample_8_S8_L001_I1_001.fastq.gz exp000705_sample_8_S8_L001_R1_001.fastq.gz exp000705_sample_8_S8_L001_R2_001.fastq.gz
* Using a list of 341 tags (vector size: 341; map size: 12,832; num elements in map: 12,881)
* will process sample 1: sample_8_S8_L001_I1_001.fastq.gz
                         sample_8_S8_L001_R1_001.fastq.gz
                         sample_8_S8_L001_R2_001.fastq.gz
* processing the reads ...
done
* processed 105 reads

When I try to read s8_summary.json with Python's standard json module in Python 3.10:

>>> with open("s8_summary.json", 'rt') as fp:
...     json.load(fp)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/jma/miniconda3/envs/base/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/home/jma/miniconda3/envs/base/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/jma/miniconda3/envs/base/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/jma/miniconda3/envs/base/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 235 column 2 (char 10629)

Comparing this file with the summary JSONs that work properly, I noticed the following in the final lines:

		{"tag": "bead3-92", "distance": 0, "count": 1},
	]
}

whereas the last 3 lines of a summary file that works normally look like this:

		{"tag": "bead3-95", "distance": 0, "count": 14048}
	]
}
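The difference between the two endings can be reproduced in isolation. The sketch below is only an illustration: the two strings are hypothetical stand-ins modelled on the final lines shown above (with tag_qc assumed to be an array under a top-level key), not the real contents of s8_summary.json, and `parses` is a helper name made up for this example.

```python
import json

# Hypothetical stand-ins for the end of a summary file; not the real data.
# The first string ends its array with a trailing comma, the second does not.
with_trailing_comma = (
    '{"tag_qc": [\n'
    '\t\t{"tag": "bead3-92", "distance": 0, "count": 1},\n'
    '\t]\n'
    '}'
)
without_trailing_comma = (
    '{"tag_qc": [\n'
    '\t\t{"tag": "bead3-95", "distance": 0, "count": 14048}\n'
    '\t]\n'
    '}'
)


def parses(text):
    """Return True if json.loads accepts the text, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError as err:
        # Report where the parser gave up; the wording varies by Python version.
        print(f"{err.msg}: line {err.lineno} column {err.colno}")
        return False


if __name__ == "__main__":
    print(parses(with_trailing_comma))
    print(parses(without_trailing_comma))
```

Only the string without the trailing comma is accepted, which matches the behaviour of the two summary files.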

Removing the comma at the end of the third-to-last line of s8_summary.json restores compatibility with Python. I wonder whether this is more of a Python issue or a bug in splitcode's summary output code?
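For context, the JSON grammar (RFC 8259) does not allow a comma after the last element of an array or object, so a strict parser such as Python's json module is expected to reject the file. As a possible stopgap on the reading side, the trailing comma can be stripped before parsing. This is a minimal sketch, not a fix for splitcode itself: `strip_trailing_commas` and `load_summary_lenient` are hypothetical helper names, and the regex is an assumption that holds only if no string value in the file contains a comma directly followed by a closing bracket or brace.

```python
import json
import re

# Matches a comma followed only by whitespace and then a closing ] or }.
# Assumption: this pattern never occurs inside a JSON string value.
_TRAILING_COMMA = re.compile(r",(\s*[\]}])")


def strip_trailing_commas(text):
    """Remove commas that sit directly before a closing bracket or brace."""
    return _TRAILING_COMMA.sub(r"\1", text)


def load_summary_lenient(path):
    """Read a summary JSON file, tolerating trailing commas."""
    with open(path, "rt") as fp:
        return json.loads(strip_trailing_commas(fp.read()))
```

Calling `load_summary_lenient("s8_summary.json")` would then take the place of the plain `json.load` call above.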

sample_8_S8_L001_I1_001.fastq.gz
sample_8_S8_L001_R1_001.fastq.gz
sample_8_S8_L001_R2_001.fastq.gz
s8_summary.json
tags.txt

Labels

bug (Something isn't working)