Shiva library: Implementation in Rust of a parser and generator for documents of any type
- Common Document Model (CDM) for all document types
- Parsers produce CDM
- Generators consume CDM
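
The three points above can be illustrated with a deliberately tiny sketch: one parser produces a shared model, and any number of generators consume it. Every name here (`Element`, `Document`, `parse_markdown`, `generate_html`, `generate_text`) is a hypothetical stand-in chosen for illustration, not Shiva's actual types or API, and HTML escaping is omitted for brevity.

```rust
// Hypothetical, simplified common document model (not Shiva's real types).
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph { text: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Document {
    elements: Vec<Element>,
}

/// Parser: Markdown-like text -> model.
/// Lines starting with 1 to 6 '#' characters become headers, other non-empty lines paragraphs.
fn parse_markdown(input: &str) -> Document {
    let mut elements = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        let level = line.chars().take_while(|c| *c == '#').count();
        if (1..=6).contains(&level) {
            // '#' is a single byte, so slicing at `level` is safe.
            let text = line[level..].trim().to_string();
            elements.push(Element::Header { level: level as u8, text });
        } else {
            elements.push(Element::Paragraph { text: line.to_string() });
        }
    }
    Document { elements }
}

/// Generator: model -> HTML-like text (no escaping, illustration only).
fn generate_html(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        match element {
            Element::Header { level, text } => {
                out.push_str(&format!("<h{level}>{text}</h{level}>\n"));
            }
            Element::Paragraph { text } => {
                out.push_str(&format!("<p>{text}</p>\n"));
            }
        }
    }
    out
}

/// Generator: model -> plain text, one element per line.
fn generate_text(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        let text = match element {
            Element::Header { text, .. } => text,
            Element::Paragraph { text } => text,
        };
        out.push_str(text);
        out.push('\n');
    }
    out
}
```

Because both generators depend only on `Document`, adding a new input format in this sketch means writing one parser, and every existing generator works with it unchanged.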

| Document type | Parse | Generate |
|---|---|---|
| Plain text | + | + |
| Markdown | + | + |
| HTML | + | + |
| PDF | + | + |
| JSON | + | + |
| XML | + | + |
| CSV | + | + |
| RTF | + | + |
| DOCX | + | + |
| XLS | + | - |
| XLSX | + | + |
| ODS | + | + |
| Typst | - | + |

Parse features by document type:

| Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
|---|---|---|---|---|---|---|---|---|
| Plain text | - | + | - | - | - | - | - | - |
| Markdown | + | + | + | + | + | + | - | - |
| HTML | + | + | + | + | + | + | - | - |
| PDF | - | + | + | - | - | - | - | - |
| DOCX | + | + | + | + | - | + | - | - |
| RTF | + | + | + | + | - | + | + | + |
| JSON | + | + | + | + | - | + | + | + |
| XML | + | + | + | + | + | + | + | + |
| CSV | - | - | - | + | - | - | - | - |
| XLS | - | - | - | + | - | - | - | - |
| XLSX | - | - | - | + | - | - | - | - |
| ODS | - | - | - | + | - | - | - | - |

Generate features by document type:

| Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
|---|---|---|---|---|---|---|---|---|
| Plain text | + | + | + | + | - | + | + | + |
| Markdown | + | + | + | + | + | + | + | + |
| HTML | + | + | + | + | + | + | - | - |
| PDF | + | + | + | + | + | + | + | + |
| DOCX | + | + | + | + | + | + | - | - |
| RTF | + | + | + | + | + | + | - | - |
| JSON | + | + | + | + | - | + | + | + |
| XML | + | + | + | + | + | + | + | + |
| CSV | - | - | - | + | - | - | - | - |
| XLSX | - | - | - | + | - | - | - | - |
| ODS | - | - | - | + | - | - | - | - |
| Typst | + | + | + | + | + | + | + | + |
Cargo.toml
[dependencies]
shiva = { version = "1.4.9", features = ["html", "markdown", "text", "pdf", "json",
    "csv", "rtf", "docx", "xml", "xls", "xlsx", "ods", "typst"] }

main.rs
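// TransformerTrait must be in scope to call parse/generate
// (path assumed from the shiva::core module referenced below).
use shiva::core::TransformerTrait;

fn main() {
    // Read the source HTML file (needs the `bytes` crate as a dependency).
    let input_vec = std::fs::read("input.html").unwrap();
    let input_bytes = bytes::Bytes::from(input_vec);
    // Parse HTML into the Common Document Model.
    let document = shiva::html::Transformer::parse(&input_bytes).unwrap();
    // Generate Markdown from the same model and write it out.
    let output_bytes = shiva::markdown::Transformer::generate(&document).unwrap();
    std::fs::write("out.md", output_bytes).unwrap();
}

git clone https://github.com/igumnoff/shiva.git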
cd shiva/cli
cargo build --release

cd ./target/release/
./shiva README.md README.html

cd ./target/release/
./shiva-server --port=8080 --host=127.0.0.1

I would love to see contributions from the community. If you experience bugs, feel free to open an issue. If you would like to implement a new feature or bug fix, please follow the steps:
- Fork the repository
- Comment on the issue to say that you are going to work on it
- Create a pull request
If you would like to add a new document type, you need to implement the following traits:
pub trait TransformerTrait {
    fn parse(document: &Bytes) -> anyhow::Result<Document>;
    fn generate(document: &Document) -> anyhow::Result<Bytes>;
}

Optional: shiva::core::TransformerWithImageLoaderSaverTrait (if images are stored outside the document, for example in HTML or Markdown)
pub trait TransformerWithImageLoaderSaverTrait {
    fn parse_with_loader<F>(document: &Bytes, image_loader: F) -> anyhow::Result<Document>
    where F: Fn(&str) -> anyhow::Result<Bytes>;
    fn generate_with_saver<F>(document: &Document, image_saver: F) -> anyhow::Result<Bytes>
    where F: Fn(&Bytes, &str) -> anyhow::Result<()>;
}

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Shiva by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
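
For contributors adding a new document type, the following is a minimal sketch of what implementing a trait shaped like `TransformerTrait` could look like. To stay self-contained it uses only the standard library: `&[u8]` and `Vec<u8>` stand in for `Bytes`, `Result<_, String>` stands in for `anyhow::Result`, and `Document` is a simplified placeholder. The `BlankLineText` format is hypothetical and exists only for this illustration.

```rust
// Simplified placeholder for the document model (not Shiva's real Document).
#[derive(Debug, PartialEq)]
struct Document {
    paragraphs: Vec<String>,
}

// Local stand-in with the same two-method shape as the trait shown earlier;
// std types replace Bytes and anyhow::Result so the sketch needs no crates.
trait TransformerTrait {
    fn parse(document: &[u8]) -> Result<Document, String>;
    fn generate(document: &Document) -> Result<Vec<u8>, String>;
}

/// Hypothetical format: UTF-8 text with paragraphs separated by blank lines.
struct BlankLineText;

impl TransformerTrait for BlankLineText {
    fn parse(document: &[u8]) -> Result<Document, String> {
        // Reject input that is not valid UTF-8 instead of panicking.
        let text = std::str::from_utf8(document).map_err(|e| e.to_string())?;
        let paragraphs = text
            .split("\n\n")
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .map(String::from)
            .collect();
        Ok(Document { paragraphs })
    }

    fn generate(document: &Document) -> Result<Vec<u8>, String> {
        // Join paragraphs with a blank line between them.
        Ok(document.paragraphs.join("\n\n").into_bytes())
    }
}
```

Both methods are associated functions with no `self`, so a caller writes `BlankLineText::parse(bytes)` and can pass the resulting document to the `generate` of any other type implementing the same trait.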

