Name: Samuel Sommerer
Email: sommerer@usc.edu
The first thing that came to mind when trying to solve this problem was to use a transformer,
since transformers are the gold standard for seq2seq tasks. I then found an existing solution to this
problem on GitHub that also uses a transformer. You can find that repo here: https://github.com/jaymody/seq2seq-polynomial.
That repo already had a transformer implementation that achieved good test accuracy with under
5 million trainable parameters, so I forked it and used it as a starting point.
I made several modifications to the existing transformer. First, I introduced label smoothing when
computing the cross-entropy loss, to combat overfitting. This turned out to be largely unnecessary,
since I wasn't training for enough epochs for overfitting to become a problem (Google Colab limited my GPU usage).
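Roughly, the change amounts to something like the sketch below. The smoothing factor of 0.1 and the padding index of 0 are illustrative placeholders, not necessarily the values in my fork:

```python
import torch
import torch.nn as nn

# Label smoothing spreads a little probability mass (here 0.1) from the
# gold token across the rest of the vocabulary, discouraging the model
# from becoming overconfident. PyTorch 1.10+ supports it directly in the
# loss; the smoothing factor and padding index are placeholder values.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=0)

logits = torch.randn(32, 100)            # (batch, vocab_size) model outputs
targets = torch.randint(1, 100, (32,))   # gold token ids
loss = criterion(logits, targets)
```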
Second, I manually implemented a ReZero encoder layer for the transformer, with the goal of
speeding up convergence and cutting down on training time. I got ideas for how to implement
the ReZero encoder layer here: https://github.com/tbachlechner/ReZero-examples.
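A minimal sketch of what a ReZero encoder layer looks like in PyTorch is below. The key idea is to drop LayerNorm and scale each sublayer's output by a learnable scalar alpha initialized to zero, so every layer starts out as the identity. The hyperparameter names and defaults here are illustrative, not copied from either repo:

```python
import torch
import torch.nn as nn

class ReZeroEncoderLayer(nn.Module):
    """A ReZero variant of a transformer encoder layer: LayerNorm is
    dropped, and each sublayer's output is scaled by a learnable scalar
    alpha that starts at zero, so the layer begins as the identity."""

    def __init__(self, d_model, nhead, dim_feedforward=2048, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, dropout=dropout)
        self.linear1 = nn.Linear(d_model, dim_feedforward)
        self.linear2 = nn.Linear(dim_feedforward, d_model)
        self.dropout = nn.Dropout(dropout)
        self.activation = nn.ReLU()
        self.alpha = nn.Parameter(torch.zeros(1))  # the ReZero residual weight

    def forward(self, src, src_mask=None, src_key_padding_mask=None):
        # Self-attention sublayer: residual add, gated by alpha.
        attn_out, _ = self.self_attn(src, src, src, attn_mask=src_mask,
                                     key_padding_mask=src_key_padding_mask)
        src = src + self.alpha * self.dropout(attn_out)
        # Position-wise feed-forward sublayer, gated by the same alpha.
        ff_out = self.linear2(self.dropout(self.activation(self.linear1(src))))
        src = src + self.alpha * self.dropout(ff_out)
        return src
```

Because alpha starts at zero, gradients initially flow through the skip connections unimpeded, and each layer learns how much of its sublayers to mix in; this is what lets ReZero networks converge faster without warmup.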