{
    "tot_epoch": 10000,
    "tot_step": 2300000,
    "single_training_step": 2000000,
    "train_lambda": 2048,
    "lr": {
        "base": 0.00005,
        "decay": [0.3, 0.1, 0.03, 0.01],
        "decay_interval": [1900000, 2250000, 2270000, 2290000]
    }
}
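For reference, a `decay`/`decay_interval` pair like the one above is usually interpreted as a stepwise schedule: the learning rate stays at `base` until the global step crosses a boundary, then drops to `base * decay[i]`. This is a minimal sketch of that reading; the function and constant names are my own assumptions, not the author's code:

```python
# Assumed interpretation of the config above: once the global step passes
# decay_interval[i], the learning rate becomes base * decay[i].
BASE_LR = 0.00005
DECAY = [0.3, 0.1, 0.03, 0.01]
DECAY_INTERVAL = [1900000, 2250000, 2270000, 2290000]

def lr_at_step(step):
    lr = BASE_LR
    for factor, boundary in zip(DECAY, DECAY_INTERVAL):
        if step >= boundary:
            lr = BASE_LR * factor  # keep applying later (smaller) factors
    return lr
```

Under this reading, the LR is 5e-05 for the first 1.9M steps, 1.5e-05 until step 2.25M, and so on down to 5e-07 near the end of training.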
When I trained a model with `train_lambda = 2048` using the training strategy provided by the author, the results were close to those of the pre-trained model. However, when I switched to `train_lambda = 512`, the results were significantly worse. Does anyone know why?