add SEA-LION support#6448
Conversation
def set_vocab(self):
    try:
        self._set_vocab_gpt2()
    except:
@bryanSwk Why would _set_vocab_gpt2() ever fail, and why did you add this except? I'm trying to understand what the except clause is doing here, whether we should have it, or whether we should qualify it a bit more. This open-ended except is breaking the CI linter. We can fix it by changing it to except Exception:, but I'd prefer to understand why this branch is here.
Hi @HanClinto, my intention for this try-except is to differentiate the SEA-LION variant of MPT, which utilises an SPM tokenizer.
There isn't a differentiating field in config.json for SEA-LION 7B, as it also uses the MPTForCausalLM class. Hence, this try-except is just a fallback for the SEA-LION model.
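The fallback described above could be sketched roughly as follows. This is a self-contained illustration, not the actual convert script: the class, its constructor argument, and the two stub loaders are hypothetical stand-ins for llama.cpp's vocab helpers.

```python
class MPTModel:
    """Sketch of the vocab fallback discussed above (hypothetical stubs)."""

    def __init__(self, has_bpe_vocab: bool):
        self.has_bpe_vocab = has_bpe_vocab  # hypothetical flag standing in for on-disk vocab files
        self.vocab_kind = None

    def _set_vocab_gpt2(self):
        # Stand-in for loading a GPT-2 style BPE vocab; raises when files are absent.
        if not self.has_bpe_vocab:
            raise FileNotFoundError("no BPE vocab files found")
        self.vocab_kind = "bpe"

    def _set_vocab_sentencepiece(self):
        # Stand-in for loading a SentencePiece (SPM) vocab, as used by SEA-LION.
        self.vocab_kind = "spm"

    def set_vocab(self):
        # SEA-LION reuses MPTForCausalLM, so config.json cannot distinguish it
        # from vanilla MPT; try BPE first, fall back to SPM for SEA-LION.
        try:
            self._set_vocab_gpt2()
        except Exception:  # qualified except keeps the CI linter happy
            self._set_vocab_sentencepiece()
```

Qualifying the bare except as except Exception: preserves the fallback behaviour while satisfying the linter.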
Very helpful, thank you!
Do you happen to know what kind of exception it will throw in that instance? If not, I can just change it to except Exception: and it should still pass lint.
Yes, except Exception: will work; you can go ahead and make the change.
thank you!
Sounds good! Feel free to make any suggestions or wording changes on this PR:
#6470
I'm not very familiar with SEA-LION, so I welcome any adjustments you may have that would make things more clear. :)
* initial commit for sealion support
* add sealion support
* minor fix
* q/k ln and pos_embd only if required
* Apply suggestions from code review
  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* minor : clear whitespaces

Co-authored-by: bryan <bryansiow@aisingapore.org>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>




This PR intends to add support for SEA-LION models, which are based on the MPT architecture with added bias, pos_embd, and qk_ln layers. This PR builds upon @datquocnguyen's PR, with modifications that make the pos_embd and qk_ln layers optional. Sanity checks have been done on SEA-LION 7B Instruct and MPT 7B Instruct.
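Gating the optional layers during conversion could look something like the sketch below. The config keys (no_bias, learned_pos_emb, attn_config.qk_ln) are assumptions modelled on the Hugging Face MPT config, and the returned tensor names are illustrative rather than the exact GGUF names.

```python
def optional_tensors(config: dict) -> list:
    """Return the extra tensor names to export for a given MPT-style config.

    Config keys are assumptions modelled on the HF MPT config:
    `no_bias`, `learned_pos_emb`, and `attn_config.qk_ln`.
    """
    extra = []
    if not config.get("no_bias", True):
        extra.append("bias")  # SEA-LION keeps biases; vanilla MPT drops them
    if config.get("learned_pos_emb", False):
        extra.append("pos_embd")  # learned position embeddings
    if config.get("attn_config", {}).get("qk_ln", False):
        extra.extend(["attn_q_norm", "attn_k_norm"])  # q/k layernorm
    return extra
```

With this kind of gating, a vanilla MPT checkpoint exports no extra tensors, while a SEA-LION checkpoint picks up the bias, position-embedding, and q/k-norm tensors.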