Why are the confidence and distance for an original video low and high, respectively?

@joonson I have some questions about the code in `SyncNetInstance.py`.

![Screenshot 2024-04-08 132416](https://github.com/joonson/syncnet_python/assets/123265273/54b944f3-6d8c-4e07-bde3-edb08e9da759)

In the function `calc_pdist`, the window is there to account for the possible offset, right?
As I understand it, `torch.stack(dists,1)` returns a tensor of shape (lastframe, window_size), and `mdist` is then computed from it. I cannot follow the logic behind `mdist = torch.mean(torch.stack(dists,1),1)`: the mean is taken across the columns, which gives an `mdist` of shape (1,31), i.e., simply a list of 31 values.
**Could you please explain why the mean is taken across columns? From my understanding, the mean should be taken across rows, which would give shape (lastframe, 1), i.e., one mean per frame over its window.**
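To make the shapes in question concrete, here is a minimal sketch on synthetic features. It is not the repository's code: `calc_pdist_sketch`, the feature sizes, and `true_shift` are hypothetical. It assumes each element of `dists` is a 1-D tensor with one distance per candidate offset, and that the confidence is the median of `mdist` minus its minimum, which is how I read the script.

```python
import torch
import torch.nn.functional as F


def calc_pdist_sketch(feat1, feat2, vshift=15):
    """Hypothetical stand-in for calc_pdist: for every frame i, compute the
    distance between feat1[i] and feat2 at each shift in -vshift..+vshift."""
    win_size = 2 * vshift + 1
    # Zero-pad feat2 along the time axis so every frame has a full window.
    feat2p = F.pad(feat2, (0, 0, vshift, vshift))
    dists = []
    for i in range(len(feat1)):
        query = feat1[i].unsqueeze(0).expand(win_size, -1)  # (win_size, dim)
        window = feat2p[i:i + win_size]                     # (win_size, dim)
        dists.append(F.pairwise_distance(query, window))    # (win_size,)
    return dists


torch.manual_seed(0)
num_frames, dim, vshift, true_shift = 50, 8, 15, 3

# Synthetic data: the "audio" features are the "video" features delayed
# by true_shift frames, so the best alignment is known in advance.
video_feat = torch.randn(num_frames, dim)
audio_feat = torch.roll(video_feat, shifts=true_shift, dims=0)

dists = calc_pdist_sketch(video_feat, audio_feat, vshift)

stacked = torch.stack(dists, 1)   # shape (win_size, num_frames)
mdist = torch.mean(stacked, 1)    # shape (win_size,): one value per offset

minval, minidx = torch.min(mdist, 0)
offset = vshift - minidx.item()                # assumed offset convention
conf = (torch.median(mdist) - minval).item()   # assumed confidence definition

print(stacked.shape, mdist.shape, minidx.item(), offset)
```

Under these assumptions, stacking along dimension 1 puts the candidate offsets on the rows and the frames on the columns, so `torch.mean(..., 1)` averages over frames and leaves one mean distance per candidate offset (31 values for `vshift=15`). Averaging over the other dimension would instead give one value per frame, with no way to pick a best offset.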

I also ran an experiment. First, I computed the distance and confidence for an original file that was not dubbed. The distance came out quite high and the confidence very low, whereas I expected a low distance and a high confidence. Then I created a dubbed video of a speaker saying the same statement as in the original file, using the Wav2Lip model, and computed the distance and confidence again. This distance is comparatively lower than the one computed for the original video.
What could be the reason for this?

**Please share your views on why the mean is taken across columns rather than across rows.**


