Merged
80 commits
624672d
generate updated
feihugis Sep 7, 2021
d677d7a
cleanup
feihugis Sep 8, 2021
408fefa
generate update
feihugis Sep 8, 2021
bb53938
debug and cleanup
JulianneKnott Sep 9, 2021
3c095d0
more cleanup
JulianneKnott Sep 9, 2021
09667b8
cleanup
JulianneKnott Sep 10, 2021
315c165
cleanup
JulianneKnott Sep 10, 2021
121bb4c
fixed benchmark script
JulianneKnott Sep 10, 2021
6e3b2d2
beam search optimizer updates
JulianneKnott Sep 14, 2021
3e69d18
el attention update
JulianneKnott Sep 24, 2021
e4fcf96
bug fix
JulianneKnott Sep 27, 2021
22ebfb0
debug fairseq tests (w/ beam search opt) + cleanup
JulianneKnott Sep 29, 2021
c4a87f5
clean
JulianneKnott Sep 29, 2021
ce7557d
cleanup
JulianneKnott Sep 29, 2021
7f092bd
fairseq tests (w/ el attn) + cleanup
JulianneKnott Sep 29, 2021
e93f7e0
add cli test for fairseq
JulianneKnott Oct 1, 2021
1ffd7a2
bug fix
JulianneKnott Oct 4, 2021
32f01e6
speed improvements
JulianneKnott Oct 15, 2021
d8240ae
prophetnet updates
JulianneKnott Oct 19, 2021
c25515a
cleanup
JulianneKnott Oct 19, 2021
3034a31
fairseq version docker
JulianneKnott Oct 19, 2021
424a332
cleanup
JulianneKnott Oct 20, 2021
b2d6298
cleanup
JulianneKnott Oct 20, 2021
3540fa8
cleanup
JulianneKnott Oct 20, 2021
f454522
pipeline debug
JulianneKnott Oct 20, 2021
9ba9a98
pipeline debug
JulianneKnott Oct 20, 2021
59949b3
debug pipeline
JulianneKnott Oct 20, 2021
f3feffc
unit test clean up
JulianneKnott Oct 23, 2021
f692e11
debug
JulianneKnott Oct 25, 2021
badc5d9
debug
JulianneKnott Oct 25, 2021
269dd9f
try sudo uninstall
JulianneKnott Oct 25, 2021
1c4758d
some benchmark updates
JulianneKnott Oct 25, 2021
78871b5
explicit add to path in pipeline
JulianneKnott Oct 25, 2021
62f2bf2
debug pipeline
JulianneKnott Oct 25, 2021
38c2f25
fix typo in debug pipeline
JulianneKnott Oct 25, 2021
04bdffc
debug pipeline - path update
JulianneKnott Oct 25, 2021
556d868
debug pipeline
JulianneKnott Oct 25, 2021
dcd606e
debug pipeline fs test env
JulianneKnott Oct 25, 2021
c15cc0b
debug pipeline fs test env setup
JulianneKnott Oct 25, 2021
764d97c
more debug
JulianneKnott Oct 25, 2021
7fac49a
debug
JulianneKnott Oct 26, 2021
cf4165a
debug
JulianneKnott Oct 26, 2021
8f2ab96
debug
JulianneKnott Oct 26, 2021
c3d8e94
debug
JulianneKnott Oct 26, 2021
39c1d62
debug
JulianneKnott Oct 26, 2021
75df5ac
debug + benchmark updates
JulianneKnott Oct 26, 2021
4976451
cleanup
JulianneKnott Oct 26, 2021
3e0de2f
cleanup
JulianneKnott Oct 26, 2021
1dbbab1
debug: skip fastseq + transformers tests
JulianneKnott Oct 26, 2021
60cb8e4
debug pipeline
JulianneKnott Oct 26, 2021
49259da
debug pipeline
JulianneKnott Oct 26, 2021
7e7c9a3
debug pipeline
JulianneKnott Oct 26, 2021
3d53648
debug
JulianneKnott Oct 26, 2021
c83cc75
debug
JulianneKnott Oct 26, 2021
915a6dc
debug
JulianneKnott Oct 26, 2021
2ad2aca
debug - beamsearch only
JulianneKnott Oct 26, 2021
2aa6ce8
debug
JulianneKnott Oct 26, 2021
1873cbf
debug - switch order
JulianneKnott Oct 26, 2021
e698946
debug - move prep env
JulianneKnott Oct 26, 2021
fa0495b
debug
JulianneKnott Oct 26, 2021
f275b6e
debug
JulianneKnott Oct 26, 2021
0dca80f
skip problematic fairseq tests
JulianneKnott Oct 26, 2021
87fe626
debug - change pipmain flag
JulianneKnott Oct 26, 2021
276978d
debug - fix typo
JulianneKnott Oct 26, 2021
4492d8c
debug
JulianneKnott Oct 26, 2021
c6060e7
debug - hydracore
JulianneKnott Oct 27, 2021
2e9715a
debug pipeline - fix pipmain install from local
JulianneKnott Oct 27, 2021
13a3c32
pip debug try add to path
JulianneKnott Oct 27, 2021
b9a850e
debug pipeline more changes to fairseq install
JulianneKnott Oct 27, 2021
79b548e
check if path update change is necessary
JulianneKnott Oct 27, 2021
93536b5
cleanup and run all tests
JulianneKnott Oct 27, 2021
190a340
fix newline at end of beam search
JulianneKnott Oct 27, 2021
1801a82
fix formatting el attn
JulianneKnott Oct 27, 2021
33c401c
cleanup
JulianneKnott Oct 27, 2021
5c74cb2
removed debugging code
JulianneKnott Oct 27, 2021
71abfc3
comment for parent + child classes sharing name
JulianneKnott Oct 27, 2021
b3f722d
update fastseq version
JulianneKnott Oct 27, 2021
cc5446f
fix readme
JulianneKnott Oct 27, 2021
4a3f17e
cleanup
JulianneKnott Oct 27, 2021
4747b23
fix readme
JulianneKnott Oct 27, 2021
8 changes: 4 additions & 4 deletions README.md
@@ -16,17 +16,17 @@ Below shows the generation speed gain by using FastSeq.

| Model | W/O FastSeq (in samples/s) | W/ FastSeq (in samples/s) | Speedup |
|------------------|:--------------------------:|:-------------------------:|:-----:|
| [ProphetNet](examples/prophetnet/README.md) | 2.8 | 10.7 | 3.8x |
| [Bart (`fs`)](examples/bart/README.md) | 2.4 | 25.3 | 10.5x |
| [ProphetNet](examples/prophetnet/README.md) | 2.8 | 11.9 | 4.3x |
| [Bart (`fs`)](examples/bart/README.md) | 3.3 | 25.1 | 7.7x |
| [Bart (`hf`)](examples/bart/README.md#speedup-bart-huggingface-transformers-version-by-using-fastseq) | 2.5 | 12.4 | 5.0x |
| [DistilBart (`hf`)](examples/distilbart/README.md) | 3.4 | 18.5 | 5.4x |
| [T5 (`hf`)](examples/t5/README.md) | 8.7 | 31.3 | 3.6x |
| [WMT16 En-De (`fs`)](examples/wmt/README.md) | 96.0 | 417.0 | 4.3x |
| [WMT16 En-De (`fs`)](examples/wmt/README.md) | 144.5 | 422.8 | 2.9x |
| [GPT2 (`hf`)](examples/gpt2/README.md) | 3.0 | 16.7 | 5.5x |
| [UniLM (`hf`)](examples/unilm/README.md) | 1.7 | 16.4 | 9.6x |

- All benchmarking experiments run on NVIDIA-V100-16GB with [docker](docker/Dockerfile). Highest speed recorded for each model by tuning batch size. For parameter setting details, click link of corresponding model.
- `fs` stands for [Fairseq](https://github.com/pytorch/fairseq) 0.9.0 version, `hf` stands for [Huggingface Transformers](https://github.com/huggingface/transformers) 3.0.2 version.
- `fs` stands for [Fairseq](https://github.com/pytorch/fairseq) 0.10.2 version, `hf` stands for [Huggingface Transformers](https://github.com/huggingface/transformers) 3.0.2 version.
- Optimizations were automatically applied to all generation/sequence models in Fairseq & Huggingface Transformers. Above only lists a subset of them.
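
The Speedup column appears to be the ratio of the two throughput columns. A minimal sketch of that arithmetic, assuming one-decimal rounding; the helper name `speedup` is hypothetical and not part of FastSeq:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of FastSeq: ratio of FastSeq throughput to
# baseline throughput (both in samples/s), printed with one decimal place.
speedup() {
  local baseline="$1" optimized="$2"
  awk -v b="$baseline" -v o="$optimized" 'BEGIN {
    if (b <= 0) { print "NA"; exit }   # avoid dividing by zero
    printf "%.1fx\n", o / b
  }'
}

# WMT16 En-De row from the table: 144.5 -> 422.8 samples/s
speedup 144.5 422.8   # prints 2.9x
```

Published figures may have been computed from unrounded measurements, so a row recomputed from the rounded table values can differ in the last digit.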

## How it works?
20 changes: 18 additions & 2 deletions azure-pipelines.yml
@@ -15,13 +15,29 @@ jobs:
demands:
- agent.name -equals gpu3
container:
image: adsbrainwestus2.azurecr.io/fastseq:dev-py3
image: adsbrainwestus2.azurecr.io/fastseq:dev-py3
endpoint: fastseq-acr
options: --gpus device=3
steps:
- script: |
#install fastseq
pip install --editable .[transformers,fairseq]
which pip
which python

echo "******* Installing fairseq *******"
pip install fairseq==0.10.2
pip show fairseq

echo "******* Installing transformers *******"
pip install transformers
pip show transformers

echo "******* Installing fastseq *******"
pip install --editable .
pip show fastseq

echo "******* Adding local bin to path *******"
export PATH="$HOME/bin:$HOME/.local/bin:$PATH"

echo "******* Running fastseq unittests *******"
pip install pytorch-transformers==1.0.0
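
The `pip show` calls in this step only print package metadata; they do not fail the job if the wrong version ends up installed. A minimal sketch of a stricter check, assuming the pipeline wants to fail fast on a version mismatch. Both function names are hypothetical and are not part of this PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the "Version:" field out of `pip show` output on stdin.
installed_version() {
  awk '/^Version:/ { print $2; exit }'
}

# Hypothetical guard: return non-zero when the reported version differs from the pin.
require_version() {
  local expected="$1" actual
  actual=$(installed_version)
  [ "$actual" = "$expected" ] || {
    echo "expected $expected, got ${actual:-none}" >&2
    return 1
  }
}

# In the pipeline step this would be used as:
#   pip show fairseq | require_version 0.10.2
# Demonstrated on canned text so the sketch runs without pip:
printf 'Name: fairseq\nVersion: 0.10.2\n' | require_version 0.10.2 && echo "fairseq pin ok"
```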
4 changes: 3 additions & 1 deletion benchmarks/benchmark_fs.sh
@@ -112,6 +112,7 @@ for bs in "${bs_list[@]}"; do
--no-repeat-ngram-size 3 \
--lenpen 2.0 \
--use-el-attn \
--required-seq-len-multiple 8 \
`#--print-alignment` \
`#--print-step # KeyError: steps` \
--skip-invalid-size-inputs-valid-test $* \
@@ -132,6 +133,7 @@ for bs in "${bs_list[@]}"; do
--max-len-b 140 \
--no-repeat-ngram-size 3 \
--lenpen 2.0 \
--required-seq-len-multiple 8 \
`#--print-alignment` \
`#--print-step # KeyError: steps` \
--skip-invalid-size-inputs-valid-test $* \
@@ -140,7 +142,7 @@ for bs in "${bs_list[@]}"; do
ret=$?
end=`date +%s`
runtime=$(($end-$start))
tail=`tail -2 $STDOUT_FILE`
tail=`tail -3 $STDOUT_FILE`
if [[ $ret -eq 0 && $tail == *$mark1* ]]; then
samples=`echo $tail | sed 's/.*Translated \([0-9]*\) sentences.*/\1/'`
tokens=`echo $tail | sed 's/.*Translated .* sentences (\([0-9]*\) tokens).*/\1/'`
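
The two `sed` expressions above pull the sample and token counts out of the generation summary line, which is why the summary has to fall within the lines captured by `tail`; widening it from 2 to 3 lines presumably accounts for extra trailing output under fairseq 0.10.2. A minimal sketch of the extraction on a made-up summary line. The numbers and the function names `parse_samples` and `parse_tokens` are illustrative assumptions:

```shell
#!/usr/bin/env bash
# Sample summary line in the shape the sed patterns expect.
# The numbers are made up for illustration.
line='Translated 1000 sentences (52310 tokens) in 39.5s (25.32 sentences/s, 1324.30 tokens/s)'

# Same sed expressions as in benchmark_fs.sh, wrapped in functions.
parse_samples() {
  echo "$1" | sed 's/.*Translated \([0-9]*\) sentences.*/\1/'
}
parse_tokens() {
  echo "$1" | sed 's/.*Translated .* sentences (\([0-9]*\) tokens).*/\1/'
}

samples=$(parse_samples "$line")
tokens=$(parse_tokens "$line")
echo "samples=$samples tokens=$tokens"   # prints samples=1000 tokens=52310
```

If the summary line is missing from the captured tail, `sed` passes its input through unchanged, so the `$mark1` check before parsing is what keeps garbage out of the results.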
16 changes: 8 additions & 8 deletions benchmarks/models/fs_bart.sh
@@ -34,21 +34,21 @@ grep "bart.large.cnn cnn_dm/len-1024.bin valid " perf \
| awk '{if($8!="NA"){c+=1;s+=$8}}END{print s/c}' \
| ./range.sh 17.9 18
# Speed on V100 16GB 250W
grep -E "fairseq_v0.9.0 bart.large.cnn cnn_dm/len-1024.bin valid 32 " perf \
grep -E "fairseq_v0.10.2 bart.large.cnn cnn_dm/len-1024.bin valid 32 " perf \
| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
| ./range.sh 2.1 2.7
grep -E "fairseq_v0.9.0\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 32 " perf \
| ./range.sh 3.1 3.7
grep -E "fairseq_v0.10.2\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 32 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 7.8 100
grep -E "fairseq_v0.9.0\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 64 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 64 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 13.0 100
grep -E "fairseq_v0.9.0\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 128 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 128 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 18.1 100
grep -E "fairseq_v0.9.0\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 256 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 256 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 19 100
grep -E "fairseq_v0.9.0\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 320 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* bart.large.cnn cnn_dm/len-1024.bin valid 320 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 25 100
| ./range.sh 24.5 100
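
Each check above averages column 13 of the matching `perf` rows and pipes the result into `range.sh`, whose source is not part of this diff. A minimal sketch of that pattern, assuming `range.sh` succeeds only when the value lies within the inclusive bounds; `in_range` and `avg_col13` are hypothetical stand-ins:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for range.sh (its source is not shown in this diff):
# reads one number from stdin and succeeds only if lo <= value <= hi.
in_range() {
  local lo="$1" hi="$2"
  awk -v lo="$lo" -v hi="$hi" \
    '{ v = $1 } END { exit !(NR > 0 && v >= lo && v <= hi) }'
}

# Average of column 13 over the rows on stdin; prints -1 when there are none,
# mirroring the NR==0 guard used for the baseline checks.
avg_col13() {
  awk '{ s += $13 } END { if (NR == 0) print -1; else print s / NR }'
}

# Two made-up perf rows with throughput in column 13.
rows='a b c d e f g h i j k l 24.0 n
a b c d e f g h i j k l 26.0 n'

echo "$rows" | avg_col13                                  # prints 25
echo "$rows" | avg_col13 | in_range 24.5 100 && echo "within range"
```

The checks that use `print s/NR` without the guard divide by zero when `grep` matches nothing, so the guarded form is the safer one to copy.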
46 changes: 24 additions & 22 deletions benchmarks/models/fs_prophetnet.sh
@@ -8,19 +8,20 @@
# <batch-sizes>
source utils.sh

# TODO: update when ProphetNet is compatible with fairseq 0.10.2
# Download ProphetNet repo as the baseline if it does not exist
prophetnet_repo_path=$CACHE_DIR/ProphetNet
git_clone_if_not_in_cache \
https://github.com/microsoft/ProphetNet.git \
$prophetnet_repo_path

./benchmark.sh \
fairseq \
prophetnet_large_160G_cnndm_model \
cnn_dm_bert/len-512.bin \
valid \
64 \
--user-dir $prophetnet_repo_path/src/prophetnet/
# prophetnet_repo_path=$CACHE_DIR/ProphetNet
# git_clone_if_not_in_cache \
# https://github.com/microsoft/ProphetNet.git \
# $prophetnet_repo_path
#
# ./benchmark.sh \
# fairseq \
# prophetnet_large_160G_cnndm_model \
# cnn_dm_bert/len-512.bin \
# valid \
# 64 \
# --user-dir $prophetnet_repo_path/src/prophetnet/
./benchmark.sh \
fairseq+fastseq \
prophetnet_large_160G_cnndm_model \
@@ -33,18 +34,19 @@ grep "prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid" perf \
| awk '{if($8!="NA"){c+=1;s+=$8}}END{print s/c}' \
| ./range.sh 19.1 19.2
# # Speed on V100 16GB 250W
grep -E "fairseq_v0.9.0 prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 32 " perf \
| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
| ./range.sh 2 3
grep -E "fairseq_v0.9.0 prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 64 " perf \
| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
| ./range.sh 2 3
grep -E "fairseq_v0.9.0\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 32 " perf \
# TODO: update when ProphetNet is compatible with fairseq 0.10.2
# grep -E "fairseq_v0.10.2 prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 32 " perf \
# | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
# | ./range.sh 2 3
# grep -E "fairseq_v0.10.2 prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 64 " perf \
# | awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
# | ./range.sh 2 3
grep -E "fairseq_v0.10.2\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 32 " perf \
| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
| ./range.sh 5.7 6.5
grep -E "fairseq_v0.9.0\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 64 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 64 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 7.5 10
grep -E "fairseq_v0.9.0\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 128 " perf \
| ./range.sh 8 10.5
grep -E "fairseq_v0.10.2\+fastseq_v.* prophetnet_large_160G_cnndm_model cnn_dm_bert/len-512.bin valid 128 " perf \
| awk '{s+=$13}END{print s/NR}'\
| ./range.sh 10 15
10 changes: 5 additions & 5 deletions benchmarks/models/fs_wmt.sh
@@ -27,15 +27,15 @@ grep " wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid " perf \
| awk '{if($8!="NA"){c+=1;s+=$8}}END{print s/c}' \
| ./range.sh 0.05 0.07
# Speed on V100 16GB 250W
grep -E "fairseq_v0.9.0 wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 256 " perf \
grep -E "fairseq_v0.10.2 wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 256 " perf \
| awk '{s+=$13}END{if(NR==0) print -1; else print s/NR}' \
| ./range.sh 93 100
grep -E "fairseq_v0.9.0\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 256 " perf \
| ./range.sh 100 150
grep -E "fairseq_v0.10.2\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 256 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 350 1000
grep -E "fairseq_v0.9.0\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 512 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 512 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 390 1000
grep -E "fairseq_v0.9.0\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 1024 " perf \
grep -E "fairseq_v0.10.2\+fastseq_v.* wmt16.en.de.32k wmt16_en_de_bpe32k/bin valid 1024 " perf \
| awk '{s+=$13}END{print s/NR}' \
| ./range.sh 405 1000
2 changes: 1 addition & 1 deletion docker/Dockerfile
@@ -49,7 +49,7 @@ RUN pip install --upgrade pip && \
pip install requests>=v2.24.0 && \
pip install gitpython>=v3.1.7 && \
pip install rouge_score==v0.0.4 && \
pip install fairseq==v0.9.0 && \
pip install fairseq==v0.10.2 && \
pip install transformers==v3.0.2 && \
pip install pytorch-transformers==1.0.0
