9 changes: 8 additions & 1 deletion README.md
@@ -215,19 +215,26 @@ orthoxml-tools to-nhx --infile examples/data/sample-for-nhx.orthoxml --outdir ./
Convert Newick (NHX) format to OrthoXML.

```bash
orthoxml-tools from-nhx --infile path/to/file.nhx --outfile path/to/file.orthoxml
orthoxml-tools from-nhx --infile path/to/file.nhx --outfile path/to/file.orthoxml [--species-encode nhx|underscore]
```

**Options:**
- `--infile <file>`: Specify the input NHX file or files (at least one file is required).
  - You can specify multiple files by providing them as a space-separated list.
  - If you provide multiple files, they will be combined into a single OrthoXML output.
- `--outfile <file>`: Specify the output OrthoXML file (required).
- `--species-encode <nhx|underscore>`: How species/taxonomic levels are encoded in the Newick files (optional).
  - `nhx`: Species are encoded in NHX comments using `S=` or `T=` tags. For example: `(A_s1:0.1[&&NHX:conf=0.9:S=s1],B_s2:0.2[&&NHX:conf=0.8:S=s2]);`
  - `underscore`: Species are encoded in leaf labels using underscores (e.g., `GeneID_SpeciesID`).

**Example:**
```bash
orthoxml-tools from-nhx --infile examples/data/sample.nhx --outfile ./tests_output/from_nhx.orthoxml
orthoxml-tools from-nhx --infile examples/data/sample2.nhx examples/data/sample.nhx --outfile ./tests_output/from_nhx21.orthoxml
orthoxml-tools from-nhx \
--species-encode nhx \
--infile examples/data/sample.nhx \
--outfile tests_output/from_nhx_nhxspecies.orthoxml
```
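
To make the two `--species-encode` conventions concrete, here is a minimal Python sketch of how a single leaf could be interpreted under each one. The helper names (`parse_nhx_tags`, `species_from_nhx`, `species_from_underscore`) are hypothetical and are not part of orthoxml-tools. Splitting the label at the last underscore, and preferring `S=` over `T=`, are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple


def parse_nhx_tags(comment: str) -> Dict[str, str]:
    """Turn the body of an NHX comment, e.g. 'conf=0.9:S=s1', into a dict."""
    tags: Dict[str, str] = {}
    for field in comment.split(":"):
        key, sep, value = field.partition("=")
        if sep:  # skip fields that are not key=value pairs
            tags[key] = value
    return tags


def species_from_nhx(label: str, comment: str) -> Tuple[str, Optional[str]]:
    """'nhx' convention (assumed): keep the label, read species from S= or T=."""
    tags = parse_nhx_tags(comment)
    return label, tags.get("S", tags.get("T"))


def species_from_underscore(label: str) -> Tuple[str, str]:
    """'underscore' convention (assumed): split GeneID_SpeciesID at the last underscore."""
    gene_id, sep, species = label.rpartition("_")
    if not sep or not gene_id or not species:
        raise ValueError(f"label {label!r} is not of the form GeneID_SpeciesID")
    return gene_id, species


print(species_from_nhx("A_s1", "conf=0.9:S=s1"))  # ('A_s1', 's1')
print(species_from_underscore("A_s1"))            # ('A', 's1')
```

The real converter may treat gene identifiers differently; the sketch only shows where the species name comes from in each case.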

### 🛠️ CSV to OrthoXML (exploratory feature)
2 changes: 1 addition & 1 deletion examples/data/sample.nhx
@@ -1 +1 @@
-((A_s1:0.1[&&NHX:conf=0.9],B_s2:0.2[&&NHX:conf=0.8])[&&NHX:S=speciesA],C_s3:0.3[&&NHX:S=speciesB]);
+((A_s1:0.1[&&NHX:conf=0.9:S=s1],B_s2:0.2[&&NHX:conf=0.8:S=s2])[&&NHX:S=speciesA],C_s3:0.3[&&NHX:S=speciesB]);
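
The effect of this sample change can be checked with a short Python sketch that lists the `S=` tag found on each leaf in the old and new versions of the line. The regular expression and the `leaf_species` helper are illustrative assumptions, not code from the repository. They only handle the simple `label:length[&&NHX:...]` leaves that occur in this sample.

```python
import re

OLD = "((A_s1:0.1[&&NHX:conf=0.9],B_s2:0.2[&&NHX:conf=0.8])[&&NHX:S=speciesA],C_s3:0.3[&&NHX:S=speciesB]);"
NEW = "((A_s1:0.1[&&NHX:conf=0.9:S=s1],B_s2:0.2[&&NHX:conf=0.8:S=s2])[&&NHX:S=speciesA],C_s3:0.3[&&NHX:S=speciesB]);"

# A leaf here is "label:branch_length[&&NHX:tags]"; internal nodes have no label
# directly before their comment, so this pattern does not match them.
LEAF = re.compile(r"([A-Za-z0-9_]+):[0-9.]+\[&&NHX:([^\]]*)\]")


def leaf_species(newick: str) -> dict:
    """Map each leaf label to its S= tag, or None when the tag is missing."""
    result = {}
    for label, comment in LEAF.findall(newick):
        tags = dict(field.split("=", 1) for field in comment.split(":") if "=" in field)
        result[label] = tags.get("S")
    return result


print(leaf_species(OLD))  # {'A_s1': None, 'B_s2': None, 'C_s3': 'speciesB'}
print(leaf_species(NEW))  # {'A_s1': 's1', 'B_s2': 's2', 'C_s3': 'speciesB'}
```

In the old line two of the three leaves carry no species tag, which is presumably why the sample was updated alongside the new `nhx` mode.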
15 changes: 13 additions & 2 deletions src/orthoxml/cli.py
@@ -8,7 +8,7 @@
 from orthoxml import __version__
 from orthoxml.parsers import process_stream_orthoxml
 from orthoxml.converters.to_nhx import orthoxml_to_newick
-from orthoxml.converters.from_nhx import orthoxml_from_newicktrees
+from orthoxml.converters.from_nhx import (orthoxml_from_newicktrees, nhx_species_encoded_leaf)
 from orthoxml.converters.from_orthofinder import convert_csv_to_orthoxml
 from orthoxml.custom_parsers import (
     BasicStats,
@@ -178,11 +178,15 @@ def handle_conversion_to_nhx(args):
     logger.info("You can visualise each tree using https://beta.phylo.io/viewer/ as extended newick format.")
 
 def handle_conversion_from_nhx(args):
+    if args.species_encode == "nhx":
+        species_encode = nhx_species_encoded_leaf
+    else:
+        species_encode = None
     orthoxml_from_newicktrees(
         args.infile,
         args.outfile,
         label_to_event=None,
-        label_to_id_and_species=None
+        label_to_id_and_species=species_encode
     )
 
 def handle_conversion_from_orthofinder(args):
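
The dispatch in `handle_conversion_from_nhx` can be exercised in isolation with the sketch below. `nhx_leaf_stub` and `convert_stub` are hypothetical stand-ins for `nhx_species_encoded_leaf` and `orthoxml_from_newicktrees`, whose real signatures and behaviour are not shown in this diff. The point is only that `underscore` and an omitted flag both pass `None`, leaving the converter to use whatever default it has.

```python
from types import SimpleNamespace


def nhx_leaf_stub(label):
    """Hypothetical stand-in for nhx_species_encoded_leaf."""
    return label


def convert_stub(infiles, outfile, label_to_event=None, label_to_id_and_species=None):
    """Hypothetical stand-in for orthoxml_from_newicktrees.

    It only returns the callback it was given, so the dispatch can be inspected.
    """
    return label_to_id_and_species


def handle(args):
    # Same branching as the handler in the diff: only "nhx" selects a callback.
    species_encode = nhx_leaf_stub if args.species_encode == "nhx" else None
    return convert_stub(
        args.infile,
        args.outfile,
        label_to_event=None,
        label_to_id_and_species=species_encode,
    )


for mode in ("nhx", "underscore", None):
    args = SimpleNamespace(infile=["sample.nhx"], outfile="out.orthoxml", species_encode=mode)
    print(mode, "->", handle(args))
```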
@@ -277,6 +281,13 @@ def main():
         required=True,
         help="Paths to one or more Newick (NHX) files"
     )
+    converter_from_nhx_parser.add_argument(
+        "--species-encode",
+        required=False,
+        choices=("nhx", "underscore"),
+        help="How species/taxonomic levels are encoded in the input Newick files. 'nhx' means that the "
+             "species/taxonomic levels are encoded in the Newick file using the NHX comments S= or T=, 'underscore' "
+             "means that the species/taxonomic levels are encoded in the Newick file using underscores.")
     converter_from_nhx_parser.add_argument("--outfile", required=True, help="Path to the output OrthoXML file")
     converter_from_nhx_parser.set_defaults(func=handle_conversion_from_nhx)
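
As a rough illustration of how the new argument behaves, the following standalone `argparse` sketch mirrors the options above. The parser name and the `nargs="+"` setting on `--infile` are assumptions, since that part of `cli.py` is not visible in this hunk. With no `default` given, an omitted `--species-encode` parses to `None`, and `choices` rejects any other value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Standalone parser mirroring the from-nhx options (sketch only)."""
    parser = argparse.ArgumentParser(prog="from-nhx-sketch")
    # nargs="+" is assumed from the help text "one or more Newick (NHX) files".
    parser.add_argument("--infile", nargs="+", required=True)
    parser.add_argument("--species-encode", required=False, choices=("nhx", "underscore"))
    parser.add_argument("--outfile", required=True)
    return parser


parser = build_parser()

# Flag omitted: the attribute exists and is None.
args = parser.parse_args(["--infile", "a.nhx", "b.nhx", "--outfile", "out.orthoxml"])
print(args.infile, args.species_encode)

# Flag given: argparse stores the chosen string under species_encode.
args = parser.parse_args(["--species-encode", "nhx", "--infile", "a.nhx", "--outfile", "out.orthoxml"])
print(args.species_encode)
```

Because the handler compares against the string `"nhx"` only, the `None` produced by an omitted flag takes the same path as `underscore`.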

2 changes: 2 additions & 0 deletions tests/test_cli.sh
@@ -81,6 +81,8 @@ orthoxml-tools to-nhx \
 echo -e "\n[10] Test: Newick (NHX) to OrthoXML conversion"
 orthoxml-tools from-nhx --infile "$EXAMPLES_DIR/sample.nhx" --outfile "tests_output/from_nhx.orthoxml"
 orthoxml-tools from-nhx --infile "$EXAMPLES_DIR/sample2.nhx" "$EXAMPLES_DIR/sample.nhx" --outfile "tests_output/from_nhx21.orthoxml"
+orthoxml-tools from-nhx --species-encode "nhx" --infile "$EXAMPLES_DIR/sample.nhx" --outfile "tests_output/from_nhx_nhxspecies.orthoxml"
+
 
 echo -e "\n[11] Test: Orthofinder CSV to OrthoXML conversion"
 orthoxml-tools from-csv --infile examples/data/InputOrthogroups.csv --outfile tests_output/orthofinder.orthoxml