Feat: Voice activity threshold slider #492
| @@ -0,0 +1,68 @@ | |||
| .slider[type="range"] { | |||
We already have a slider in VideoTileSettingsModal, could you please use this there too for consistency?
Yes 👍
This pull request is a bit messy atm. We will continue to refactor it, but currently it is just at proof-of-concept level.
The Slider is now the same as in LocalVolume, we also made that slider reusable, so we can use it anywhere we want in the future.
I think that in the future the slider should be replaced with a slider from a library, since those have more functionality and better accessibility.
|
Some initial bits of feedback on this, mainly that it looks like this is done using the volume analysis: I would think that this is always going to miss the very start of the phrase, since that audio will already have been discarded by the time it has triggered a volume event, so this approach might not work. I'm not an expert on how VAD is done 'properly' though (i.e. without introducing latency to do the analysis). (Also, it's "threshold" with an 'h'.) |
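To make the concern above concrete, here is a minimal sketch of one way a gate can avoid clipping the start of a phrase: delay the audio by a few frames and let a frame through if it, or any frame queued behind it, is loud. This is not the approach taken in this PR; `LookaheadGate`, the RMS measure, the frame representation and all parameter values are hypothetical, and the delay is exactly the added latency the comment mentions.

```typescript
// Hypothetical sketch, not code from this PR.
type Frame = number[];

interface Pending {
  frame: Frame;
  loud: boolean;
}

// Root-mean-square level of a frame (0 for an empty frame).
function rms(frame: Frame): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (const s of frame) sum += s * s;
  return Math.sqrt(sum / frame.length);
}

class LookaheadGate {
  private pending: Pending[] = [];
  private readonly threshold: number;
  private readonly lookahead: number;

  // threshold: RMS level at or above which a frame counts as loud (assumed scale 0..1)
  // lookahead: number of frames of delay, which is the latency cost
  constructor(threshold: number, lookahead: number) {
    this.threshold = threshold;
    this.lookahead = lookahead;
  }

  // Queues a frame and returns the delayed output frame,
  // or null while the delay line is still filling.
  push(frame: Frame): Frame | null {
    this.pending.push({ frame, loud: rms(frame) >= this.threshold });
    if (this.pending.length <= this.lookahead) return null;
    const oldest = this.pending.shift();
    if (oldest === undefined) return null;
    // Pass the oldest frame if it is loud or if a loud frame follows it
    // within the lookahead window; otherwise emit silence of the same length.
    const pass = oldest.loud || this.pending.some((p) => p.loud);
    return pass ? oldest.frame : oldest.frame.map(() => 0);
  }
}
```

With a lookahead of one frame, a quiet frame immediately before a loud one is emitted unchanged, so the onset survives, while quiet frames with nothing loud behind them are silenced.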
|
@hugohutri and I both tried it on our machines and it seemed to work fine; we can do some more testing ofc. |
|
Typo is now fixed. I haven't noticed any problems with the start of a phrase, and it seems to capture it. |
|
Signed-off-by: Hugo Hutri <hugo.hutri98@gmail.com> |
|
Signed-off-by: Fabio Lenherr / DashieTM <fabio.lenherr@gmail.com> |
|
@hugohutri @DashieTM and everyone else who is contributing to this, Thank you! I'm Gaelle, member of the product design team here at Element. I have reviewed the above and my conclusions are the following:
Let me know if you'd like to sync in person as well - we can organise a call np |
Co-authored-by: Šimon Brandner <simon.bra.ag@gmail.com>
|
(re-request my review when ready) |
|
Alright thanks @gaelledel for the reply!
About syncing, I am open to whatever you would like to use. |
|
Related: #714 (Noise reduction) |
|
Hi @hugohutri and @DashieTM, really sorry to have let this PR languish. Unfortunately, the reality is that we don't have the product or design bandwidth to support a feature like this at the moment, given that it doesn't fit in with our roadmap items for the foreseeable future. To reflect this, I'm going to close the PR. Please don't hesitate to reach out to us again in the public WebRTC room if you want to discuss other ways to contribute to Element Call. I think there are still a lot of opportunities where we'd be happy to let the community get involved, but they're generally more aligned to a video call use case, rather than voice-first. |
|
@robintown I honestly don't understand this decision. Even if it doesn't fit with the roadmap, this PR is basically done and has been for quite some time. Why not just merge it so the users wanting/needing it are happy, and move on with your roadmap? |
|
I'm quite disappointed with the progress of this PR, especially considering it's the only one I've been following in this repository. I've been regularly checking for availability of this feature, but no update came. People contributed to this feature because they consider it important, and I share the same sentiment. Widely used applications like Discord have a similar feature to address background noise quickly. Video calls are in fact voice-first because you communicate by voice, so having a volume threshold is crucial for a smooth experience. The absence of this feature is the main reason I haven't been using element-call, and I know others who face the same issue. Additionally, I think this should be a straightforward feature accessible to all users, not just confined to an "advanced section". It's unfortunate to see that something seemingly simple, even with the support of the open-source community, can't be implemented for lack of "bandwidth". |
|
Hi @Neotamandua @SplittyDev @fti7, first of all, I would also like to apologise for being so hesitant with this longstanding PR, which also shows that we did not take the decision not to merge it lightly. In fact, we had a passionate discussion about it several times. Element Call was designed to be a slim, lightweight VoIP app which just works, also for non-expert users. Since the proposed voice gate would require a decent amount of technical knowledge from users who are neither Discord users nor power users, we decided against it. Now coming to the actual functionality of this PR, "Voice activity threshold / gate", and, if applicable, possible workarounds. At least on Linux using the PipeWire audio daemon, the following projects provide you similar functionality:
PS: Talking about 'we': I want to make you aware that this was a team decision and @robintown is only the messenger. |
|
I hope this is not considered spam, and I am sorry for the late response, but I was a bit busy this week. While I am, of course, sad to hear that this feature will not be available in Element Call, I also can't say that this PR was maintained for long; we also stopped, as we simply waited for a response. ( see the backend PR what I At the end I want to thank all the nice devs, especially Simon for the helpful reviews! Also quite fast! Some additional things: I wish you the best with this project, and I hope to see this in other Matrix clients as well at some point, cheers. |
|
Hello @fkwp, I have to disagree with you and everyone involved in making this decision. Considering that Element Call was intended to be a slim and lightweight VoIP app, it is interesting to see support for screenshare, which is undoubtedly a valuable feature, but considerably more sophisticated than a simple volume threshold. Precisely because the app is designed for simplicity, a volume threshold serves as the simplest inbuilt solution, eliminating the need for complex ML noise reduction or other algorithms. This feature, being optional, serves both technical and non-technical users. Non-power users who may not feel comfortable using it can simply choose not to, but depriving everyone of this option seems unreasonable. Considering that this app is designed for VoIP, adding a volume threshold is in line with providing an all-encompassing experience. I have personally encountered instances where meetings suffered from poor voice quality, and a volume threshold would have been immensely helpful in rectifying the issue. Even for non-technical users, colleagues in the meeting could easily guide them (even with screenshare) to set the volume threshold correctly, enhancing the overall voice experience for everyone involved and making the whole application more appealing. Coming back to Discord again: it is a prime example of a mass-adopted application, with a user base that also includes non-power users, which successfully implemented a volume gate. This feature has been a market standard, also adopted by successful predecessors such as TeamSpeak, making it a widely recognized and easily explainable tool for managing voice quality. I don't see how this needs a decent amount of technical knowledge in order to use it properly, especially given the previously mentioned fact that other meeting attendees can help out. I kindly ask you to reconsider this decision. |
|
To summarize, I understand that the main objections against this feature are:
Regarding point 1, @SplittyDev has raised an excellent point: seeing how this PR is basically done, the bandwidth requirement should be minimal. Regarding points 2 and 3, I fully agree with @Neotamandua and I would like to add a few more thoughts. About point 2: I believe this argument is misplaced in this discussion, because the feature does not take away from the existing functionality or hamper it in any way. Non-experts can and likely will just ignore such a setting. About point 3: While I understand that you want to keep Element Call as lightweight as possible, I do believe that the benefits of this feature far outweigh the added complexity, because it solves a big fraction of a more complex problem (see also #714) in a very simple way. In none of the major voice or video conferencing software that I am using (this includes not only TeamSpeak and Discord, but also more business-focused applications like Slack, Teams and Zoom) do I hear as much typing, mouse clicking and other background noise as in Element Call. I expect silence when nobody is talking, especially since this is a solved problem virtually everywhere else (and has been solved for more than 15 years, e.g. with TeamSpeak 2). With that, I would like to join @Neotamandua in politely asking you to reconsider this decision. |
|
I've been following this PR since the beginning and I was pretty shocked to see it rejected. It's frustrating that Element is constantly being touted as an alternative to Discord and, while there is some truth in that, the things that make Discord Discord are constantly being ignored. It's getting to the point that whenever I see a social media post from Element or one of the Matrix people comparing Element to Discord, my brain subconsciously edits the message to "better than generic messenger" because Element is leagues behind Discord despite only needing to implement a small number of features to get their foot in. |
|
I agree completely with the above couple of comments and I cannot wrap my head around how implementing these basic features is not one of their top priorities. I am absolutely certain that the moment Element had Discord / TeamSpeak / Mumble style voice-only rooms with voice activation, they would instantly gain tens if not hundreds of thousands of users. Just search for "self-hosted discord alternative" and you will see that there is a huge demand. Instead, the focus seems to lie on becoming another competitor to Slack or Teams, which in my opinion is a lost cause, especially if they don't already have a large userbase that would advocate for Element. The only explanation I can come up with is that it is easier to secure investments with a business-focused target userbase. I think this can never succeed. Sorry for the off-topic rant. It is so frustrating to see a project with so much potential go absolutely nowhere due to narrow-minded strategic decisions. |

Adds a voice activity threshold slider, since this is a pretty important feature for voice calls and a similar feature can be found in Discord etc. If the volume is below the threshold, the track will be muted.
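A minimal sketch of the gating rule described above, not the code from this PR. The names `GateState`, `createGate` and `shouldMute` are hypothetical, volume and threshold are assumed to be in dB, and the hold time is an added assumption (the description only states "below the threshold, mute") so that short pauses between words do not toggle the track.

```typescript
// Hypothetical sketch, not code from this PR.
interface GateState {
  open: boolean; // true while the track should be audible
  lastLoudMs: number; // time of the last sample at or above the threshold
}

function createGate(): GateState {
  return { open: false, lastLoudMs: Number.NEGATIVE_INFINITY };
}

// Returns true when the track should be muted for this volume sample.
function shouldMute(
  state: GateState,
  volumeDb: number,
  thresholdDb: number,
  nowMs: number,
  holdMs: number = 300, // assumed hold time; not part of the PR description
): boolean {
  if (volumeDb >= thresholdDb) {
    // Loud enough: open the gate and remember when this happened.
    state.open = true;
    state.lastLoudMs = nowMs;
  } else if (state.open && nowMs - state.lastLoudMs > holdMs) {
    // Quiet for longer than the hold time: close the gate.
    state.open = false;
  }
  return !state.open;
}
```

In a client, the returned value would typically drive whatever mute mechanism the call uses, with the slider supplying `thresholdDb`.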
Added features:
elem-vad.mp4
Requires:
matrix-org/matrix-js-sdk#2556