Request for baseline code for GPT-4+LLaVA

Hi, thanks for the awesome work! 

Table 2 of the paper includes an important baseline, "Socratic LLMs w/ Frame Captions (GPT-4 w/ LLaVA-1.5)". Could you also consider releasing the implementation for this baseline, especially the LLaVA captions and the GPT-4 prompts?

Any help would be highly appreciated!