First up, thank you for your work. BERTopic topic modelling gives the results I expected with this vectorizer. However, I am running into out-of-memory issues in both Google Colab and Kaggle on my custom dataset of about 7,500 documents. I do not have a paid subscription to Google Cloud Platform or Colab Pro, so running my code there is not an option. Are there any tricks or tips to optimise this vectorizer on large datasets? Thank you.
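For context, here is a minimal sketch of the kind of pipeline I'm running. The document list and the `CountVectorizer` are placeholders standing in for my actual dataset and this repo's vectorizer:

```python
from bertopic import BERTopic
from sklearn.feature_extraction.text import CountVectorizer

# Placeholder for my real data loading; the actual dataset is ~7,500 documents.
docs = ["example document one", "example document two"]

# CountVectorizer stands in here for the vectorizer from this repo.
vectorizer_model = CountVectorizer(stop_words="english")

topic_model = BERTopic(vectorizer_model=vectorizer_model)
topics, probs = topic_model.fit_transform(docs)  # this is where memory runs out
```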