Include deps columns in experiment table (dataset columns) #1183

@daavoo

It's still WIP, but I wanted to open the issue here so you can provide early feedback on what's needed on the VS Code side.

In summary, dvc exp show --json will include a new deps field for each revision:

{
    "baseline": {
        "data": {
            "deps": {
                "copy.py": {
                    "hash": "561f068574ab2a132d304dca3dd6510d",
                    "size": 310,
                    "nfiles": null
                }
            },
            "metrics": {"metrics.yaml": {"data": {"foo": 1}}},
            "params": {"params.yaml": {"data": {"foo": 1}}},
            "queued": false,
            "running": false,
            "executor": null,
            "timestamp": null
        }
    }
}

The DVC PR is here: treeverse/dvc#7089

Edit by @mattseddon:

For whoever picks this up, the data that comes through for outs/deps is always of the form

{ [path: string]: { hash: string; size: number; nfiles: null | number } }

We will have to construct the tree ourselves from the paths so that our table reflects them in the same way as Studio.

Sample output from demo-fashion-mnist
...
    "baseline": {
      "data": {
        "timestamp": "2021-12-17T07:23:19",
        "params": {
          "params.yaml": {
            "data": {
              "train": {
                "batch_size": 128,
                "hidden_units": 64,
                "dropout": 0.4,
                "num_epochs": 10,
                "lr": 0.001,
                "conv_activation": "relu"
              }
            }
          }
        },
        "deps": {
          "src/load_data.py": {
            "hash": "c5787b0916e011ea4a83c7d461325188",
            "size": 240,
            "nfiles": null
          },
          "output/data.pkl": {
            "hash": "35353c71da0985eed8bb507bbc53bd91",
            "size": 54950285,
            "nfiles": null
          },
          "src/train.py": {
            "hash": "b2b36c7878bfafa294bb9a963b97d115",
            "size": 2521,
            "nfiles": null
          },
          "output/model.h5": {
            "hash": "b23662125b47e5f80e4d1fcaf95d63b2",
            "size": 3680480,
            "nfiles": null
          },
          "src/evaluate.py": {
            "hash": "f1de572f6c093813c7344ab3e7c569e9",
            "size": 3236,
            "nfiles": null
          }
        },
        "outs": {
          "output/data.pkl": {
            "hash": "35353c71da0985eed8bb507bbc53bd91",
            "size": 54950285,
            "nfiles": null
          },
          "output/model.h5": {
            "hash": "b23662125b47e5f80e4d1fcaf95d63b2",
            "size": 3680480,
            "nfiles": null
          }
        },
        "queued": false,
        "running": false,
        "executor": null,
        "metrics": {
          "output/metrics.json": {
            "data": {
              "loss": 0.25284987688064575,
              "accuracy": 0.9071000218391418
            }
          }
        },
        "name": "gh-action"
      }
    },
    "8e1ebdb40e6d724c9fd9b1ae0269f63dac331522": {
      "data": {
        "timestamp": "2022-04-13T13:25:30",
        "params": {
          "params.yaml": {
            "data": {
              "train": {
                "batch_size": 128,
                "hidden_units": 64,
                "dropout": 0.4,
                "num_epochs": 10,
                "lr": 0.001,
                "conv_activation": "relu"
              }
            }
          }
        },
        "deps": {
          "src/load_data.py": {
            "hash": "c5787b0916e011ea4a83c7d461325188",
            "size": 240,
            "nfiles": null
          },
          "output/data.pkl": {
            "hash": "35353c71da0985eed8bb507bbc53bd91",
            "size": 54950285,
            "nfiles": null
          },
          "src/train.py": {
            "hash": "b2b36c7878bfafa294bb9a963b97d115",
            "size": 2521,
            "nfiles": null
          },
          "output/model.h5": {
            "hash": "3694740051aef8347e2c0216c6c2b60e",
            "size": 3680480,
            "nfiles": null
          },
          "src/evaluate.py": {
            "hash": "f1de572f6c093813c7344ab3e7c569e9",
            "size": 3236,
            "nfiles": null
          }
        },
        "outs": {
          "output/data.pkl": {
            "hash": "35353c71da0985eed8bb507bbc53bd91",
            "size": 54950285,
            "nfiles": null
          },
          "output/model.h5": {
            "hash": "3694740051aef8347e2c0216c6c2b60e",
            "size": 3680480,
            "nfiles": null
          }
        },
        "queued": false,
        "running": false,
        "executor": null,
        "metrics": {
          "output/metrics.json": {
            "data": {
              "loss": 0.24975043535232544,
              "accuracy": 0.9082000255584717
            }
          }
        },
        "name": "exp-c6871"
      }
    }
  }
}
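
For reference, building the tree from the flat paths could look roughly like the sketch below. It is only an illustration: `ColumnNode` and `buildDepsTree` are hypothetical names, not existing vscode-dvc code, and it assumes paths always use `/` as the separator, as they do in the sample output above.

```typescript
// Shape of each entry under "deps" / "outs", as described above.
type FileInfo = { hash: string; size: number; nfiles: null | number }
type FileInfoByPath = { [path: string]: FileInfo }

// Hypothetical tree node (illustrative only, not an existing vscode-dvc type).
interface ColumnNode {
  label: string // a single path segment, e.g. "src"
  path: string // full path up to and including this segment, e.g. "src/train.py"
  info?: FileInfo // only set on nodes that correspond to an actual dep/out
  children: ColumnNode[]
}

// Assumption: paths are "/"-separated, as in the sample output.
const buildDepsTree = (deps: FileInfoByPath): ColumnNode[] => {
  const roots: ColumnNode[] = []
  for (const [path, info] of Object.entries(deps)) {
    let siblings = roots
    let prefix = ''
    for (const segment of path.split('/')) {
      prefix = prefix ? `${prefix}/${segment}` : segment
      // Reuse the directory node if an earlier path already created it.
      let node = siblings.find(({ label }) => label === segment)
      if (!node) {
        node = { children: [], label: segment, path: prefix }
        siblings.push(node)
      }
      if (prefix === path) {
        node.info = info
      }
      siblings = node.children
    }
  }
  return roots
}

// A few entries taken from the sample output above.
const tree = buildDepsTree({
  'src/load_data.py': {
    hash: 'c5787b0916e011ea4a83c7d461325188',
    size: 240,
    nfiles: null
  },
  'output/data.pkl': {
    hash: '35353c71da0985eed8bb507bbc53bd91',
    size: 54950285,
    nfiles: null
  },
  'src/train.py': {
    hash: 'b2b36c7878bfafa294bb9a963b97d115',
    size: 2521,
    nfiles: null
  }
})
```

Here `tree` ends up with two top-level nodes, `src` and `output`, with the individual files as their children. Keeping the full `path` on every node means intermediate directories can be addressed as columns too, which may or may not be what we want for the table.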

We will probably need to add a separate tree into the activity bar to select/de-select these "columns" in the experiments webview.

Labels: A: experiments, A: integration, story, 🎨 design
