Skip to content

Conversation

@gustavonihei
Copy link
Contributor

Summary

This PR intends to provide an alternative solution to the fix provide on #4092 for ESP32-C3, and extend it to the ESP32 Wi-Fi driver.

Originally, the semaphores created by the Wi-Fi kernel thread for handling application connection requests were not being destroyed, causing the a memory leak of 32 bytes for every wapi invocation.

The ESP32-C3 solution is being refactored in order to avoid the usage of on_exit in driver code, see: #6197

Impact

Fix for semaphore leak on ESP32 and refactor of current solution on ESP32-C3.

Testing

esp32-devkitc:wapi
esp32c3-devkit:wapi

This reverts commit f5eaf82.

Signed-off-by: Gustavo Henrique Nihei <gustavo.nihei@espressif.com>
Signed-off-by: Gustavo Henrique Nihei <gustavo.nihei@espressif.com>
Signed-off-by: Gustavo Henrique Nihei <gustavo.nihei@espressif.com>
Copy link
Contributor

@pkarashchenko pkarashchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@pkarashchenko
Copy link
Contributor

One question from my side. Do we expect multi-instance of wifi driver? If not, then maybe we can just have a global semaphore pointer and use that on/atexit?
I mean I do not fully understand neither pthread_key nor task_tls_ usage here.
Maybe it is reasonable to have static pointer to semaphore and if it is NULL, the allocate and init it. Then at/onexit the static pointer can be checked again and free the memory. Am I missing something?

@pkarashchenko pkarashchenko self-requested a review May 24, 2022 19:36
@gustavonihei
Copy link
Contributor Author

gustavonihei commented May 24, 2022

One question from my side. Do we expect multi-instance of wifi driver? If not, then maybe we can just have a global semaphore pointer and use that on/atexit?

The Wi-Fi library creates a new semaphore for every thread that performs connection operations, so we cannot have a global pointer.
Since these semaphores are thread-local, this motivated the initial implementation based on pthread_key_t, so that the semaphores were being stored in Thread Local Storage, and then could be destroyed on thread termination. The solution here was based on the one implemented for ESP-IDF, which works as expected.

But on NuttX it resulted in the semaphores not being destroyed. I'll try to explain why.
The Wi-Fi library operates in a dedicated Kernel Thread, named wifi. But the pthread_key_t and the destructor for the semaphores were allocated to the Thread Local Storage of the init thread.
Every network-related request from the application will be handled by the wifi kernel thread and its child pthreads. The issue is that those child pthreads do not belong to the same Task Group from the init thread, which is the one whose TLS area contains the semaphore destructor.

So the catch here is that NuttX provides this process-like abstraction which segregates pthreads created from different tasks. So a pthread created from Task B won't be able to share keys of type pthread_key_t with another pthread from Task A.

@gustavonihei
Copy link
Contributor Author

Do we expect multi-instance of wifi driver?

No, the Wi-Fi driver does not expect to be initialized more than once.

@masayuki2009 masayuki2009 merged commit b4392f7 into apache:master May 25, 2022
@gustavonihei gustavonihei deleted the bugfix/esp_wifi_semaphore branch May 25, 2022 01:11
@yamt
Copy link
Contributor

yamt commented May 25, 2022

my config had TLS_TASK_NELEM=0 and i got a bit surprised by this change breaking the build.
does Kconfig have a way to solve this kind of things in a bit more user friendly way?

@xiaoxiang781216
Copy link
Contributor

Task tls is disabled by default.:(

@pkarashchenko
Copy link
Contributor

One question from my side. Do we expect multi-instance of wifi driver? If not, then maybe we can just have a global semaphore pointer and use that on/atexit?

The Wi-Fi library creates a new semaphore for every thread that performs connection operations, so we cannot have a global pointer. Since these semaphores are thread-local, this motivated the initial implementation based on pthread_key_t, so that the semaphores were being stored in Thread Local Storage, and then could be destroyed on thread termination. The solution here was based on the one implemented for ESP-IDF, which works as expected.

But on NuttX it resulted in the semaphores not being destroyed. I'll try to explain why. The Wi-Fi library operates in a dedicated Kernel Thread, named wifi. But the pthread_key_t and the destructor for the semaphores were allocated to the Thread Local Storage of the init thread. Every network-related request from the application will be handled by the wifi kernel thread and its child pthreads. The issue is that those child pthreads do not belong to the same Task Group from the init thread, which is the one whose TLS area contains the semaphore destructor.

So the catch here is that NuttX provides this process-like abstraction which segregates pthreads created from different tasks. So a pthread created from Task B won't be able to share keys of type pthread_key_t with another pthread from Task A.

Yes. I just reexamined the task tls code and see that I did a wrong assumption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants