Merged
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -13,7 +13,7 @@ jobs:
timeout-minutes: 10
strategy:
matrix:
-go-version: [1.16.x, 1.17.x, 1.18.x, 1.19.x]
+go-version: [1.16.x, 1.17.x, 1.18.x, 1.19.x, 1.20.x]
platform: [ubuntu-latest, macos-latest, windows-latest]
runs-on: ${{ matrix.platform }}

9 changes: 5 additions & 4 deletions Makefile
@@ -1,8 +1,9 @@
-LINTER_VERSION=v1.50.1
+LINTER_VERSION=v1.53.3
LINTER=./bin/golangci-lint
ifeq ($(OS),Windows_NT)
LINTER=./bin/golangci-lint.exe
endif
+pkgs=$(shell go list ./... | grep -v /cmd/)

.PHONY: all
all: clean setup lint test ## Run sequentially clean, setup, lint and test.
@@ -25,15 +26,15 @@ setup: ## Download dependencies.

.PHONY: test
test: ## Run tests (with race condition detection).
-go test -race -timeout=10m $(go list ./... | grep -v /cmd/)
+go test -race -timeout=10m $(pkgs)

.PHONY: bench
bench: ## Run benchmarks.
-go test -race -timeout=15m -benchmem -benchtime=2x -bench $(go list ./... | grep -v /cmd/)
+go test -race -timeout=15m -benchmem -benchtime=2x -bench .

.PHONY: cover
cover: ## Run tests with coverage. Generates "cover.out" profile and its html representation.
-go test -race -timeout=10m -coverprofile=cover.out -coverpkg=./... $(go list ./... | grep -v /cmd/)
+go test -race -timeout=10m -coverprofile=cover.out -coverpkg=./... $(pkgs)
go tool cover -html=cover.out -o cover.html

.PHONY: tidy
12 changes: 7 additions & 5 deletions README.md
@@ -30,21 +30,23 @@ Please refer to this [example](https://pkg.go.dev/github.com/actforgood/bigcsvre

### Benchmarks
```
-go test -timeout=20m -benchmem -benchtime=2x -bench=.
+go test -race -timeout=15m -benchmem -benchtime=2x -bench .
goos: darwin
goarch: amd64
pkg: github.com/actforgood/bigcsvreader
cpu: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
-Benchmark50000Rows_50Mb_withBigCsvReader-8 2 8030321166 ns/op 61739968 B/op 100219 allocs/op
-Benchmark50000Rows_50Mb_withGoCsvReaderReadAll-8 2 65555449418 ns/op 67438460 B/op 100040 allocs/op
-Benchmark50000Rows_50Mb_withGoCsvReaderReadOneByOneAndReuseRecord-8 2 66464272707 ns/op 57605856 B/op 50014 allocs/op
+Benchmark50000Rows_50Mb_withBigCsvReader-8 2 8101972370 ns/op 61740600 B/op 100267 allocs/op
+Benchmark50000Rows_50Mb_withGoCsvReaderReadAll-8 2 67070393110 ns/op 68507768 B/op 100043 allocs/op
+Benchmark50000Rows_50Mb_withGoCsvReaderReadOneByOneAndReuseRecord-8 2 69045793069 ns/op 57606112 B/op 50018 allocs/op
+Benchmark50000Rows_50Mb_withGoCsvReaderReadOneByOneProcessParalell-8 2 8286623971 ns/op 61607272 B/op 100037 allocs/op
```

Benchmarks were run against a file of ~`50Mb` in size, with a simulated processing time of `1ms` per row.
bigcsvreader was launched with `8` goroutines.
The other benchmarks use the `encoding/csv` Go package directly.
As you can see, bigcsvreader reads and processes all rows in ~`8s`.
-Go standard csv package reads and processes all rows in ~`65s`.
+The Go standard csv package reads and processes all rows sequentially in ~`67s`.
+Reading with the Go standard csv package and processing rows in parallel gives a timing comparable to that of bigcsvreader (so this strategy is a good alternative to this package).
The `ReadAll` API has the disadvantage of keeping all rows in memory.
Reading rows one by one with the `Read` API and the `ReuseRecord` flag set has the advantage of fewer allocations, at the cost of reading rows sequentially.
> Note: It's a coincidence that the parallelized timing was roughly equal to the sequential timing divided by the number of started goroutines. You should not take this as a rule.
2 changes: 1 addition & 1 deletion docs/how-it-works.drawio
@@ -1 +1 @@
<mxfile host="app.diagrams.net" modified="2022-09-15T09:49:21.643Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36" etag="slslIIGAjnM_NkKcplH7" version="20.3.0" type="device"><diagram id="na7FgJPB3qsZf9mXNy79" name="Page-1">7V1bd6M4Ev41PmfnIT6AuD7m0smcs707me45s92PBLBNN0ZewInTv34ESBhUspEdwHaC8xAQIHCV6qv6SiU8QbfLzUPirhb/wX4QTTTF30zQ3UTTVMU2yL+85bVs0S2nbJgnoU9P2jZ8DX8F7Eraug79IG2cmGEcZeGq2ejhOA68rNHmJgl+aZ42w1Hzrit3HoCGr54bwdb/hX62KFttQ9m2/x6E8wW7s6rQI0uXnUwb0oXr45daE/o0QbcJxlm5tdzcBlEuPCaX8rr7HUerB0uCOJO54E/7248/vsV/L348zza/Hr4sfmbfrwzazbMbrek3vgnnt+nzl8D1g4Q+efbKxEG+xCrfXC+j+8Rdks2bl0WYBV9Xrpe3v5BRQNoW2TIieyrZ9N10Efh0Z4bjjOpY1fNrqUhV8uzkKiZIlO/Nwii6xRFOijuj++JD2uH3Zt8hSLJgU2uicngI8DLIkldyCj3KVELHJFPmy1bBlk3bFjXlGmxQunRQzauet3InG1T0h6jBAGq4Xq2i0HOzEMfvVAu20qqGCgaGUQM0holmRlkuv/C5oQLz/+vcbm8aW/P8P4Eccu8FNZ/yatJQdFCecWSfY1djV2NXJ+uKXfGU8C18zxxYtyAzw/LlZp5HUNMnNw29qY+99bKA1hs3CucxOcEjuwRS0E3kPgXRI07DwjfUDuTQS1xG9Jk7YRn6fv401QnXtMvqAIfys9lM87z84bIE/wxqR3zzyTTMbvDfasK/qkP8r9rq+K/1hv+WAP85bUZhHFSCYRGhyokwxsVJTHNRMMv2ST8lgyOM539hMjTurtRty+fiwju0bfnCfHMxuDM3c5+KJ1ME4yIpz71Z4TDOClEZNxPjrmhJslscky/hhoX6AjfNXoI04yOFDrSsWw0tI1WFWhYFW3ZPOraBjonmgqmXPgNdk2+YccbasAeq5j2aB6YJdC9ChwSvY7/QgSKI1op9+pBaBwpSFU5DLBitacgUKAj1ZYWmBTQR+ISU0V0ydhd4jmM3+rRt5WS2PeczLqwqV96PIMteqSTddYabqiUCTF6/0euLne/5ztRgu3eb+sG7V4F6NLpfA0yl+OwzpRSvEy/YF5VSQWduMg+yffBFe8yltVfzSRARYvHcpLgiPdJLH3MEqZm0zSG3qU9tzl7Lh6UXcgOiepI3ROoKMOMHTMZAlsOzpqhwBEVRuEolmBIwt6Zr9I3A9nWRa7S1J2R25BqRyftGs5JwzSwdkVmivsiRiB3xUo796zzpkkNf5KYkkmkKtx3ZRKYTbMKsskyyXTNMsre1y3yHmeXWnKfqYQZd7D0GSUjElmP33UDWq5/IejXDnlpGs5fyWwHrBX0hg0cCA/TVNxJoYFjmuSuhCykC43wYBmn4qxY51UaovOeWHb1w6Ow1L5rIpE83qXJVEuAhPTDeFkA5l+qeUS/uWZM1cFvv2sDfpEdtvwsVgPtlu1AHVdB0KheqjS60VwtjXHNoF4oU1JkLJXxsaBfKRuGlu1DtUlxoNUk4rA8dwBciSUt1jPPyhWivL/zvO/OFSLWmp3aF5kGukObVGslJbnrz0c0I3MRFi6YgkKTVjnaeOzXQOnpr0jVEM5m07a3uSOHdEZeak/VFfEeaznXUtyNyoCPKKzjA1MvtwiUQHwkzPU0dH2qezSFQfgQTJEb+JzJbs/h0Y7YmmCGBKSBLaLRTY/cYep
PZIgmzZTNa3jqJXm8S1/uZu4I2TWzVVuiFwOvvcjpqN9MDCkM0frYCwqQmknhfsxW6BGN4R/IWzQGK5K2bIEbuTuQwFgAiH2cBD1Q0T3IsqOdBZwF1fVRy31O9xIEJosxh1Qzr6kY1d6xm1UYnVrJE2cZ78pG2I5oLFLpJmErqTuowVgdSH3OZx2dIWKTddy4TcSSDDKSpdmQuUzPsZl+qMzX5znrmkAac1rjIZCazr7NPZhowlHpMsBekDd4uqLHMBXSVFhK7zpm8ttpMRJWXZQpgJ8tXD2P5+2p0eix2dxSjieK6LpqOEvlNPgnTneYOy8GdC4QfCODHYzUb2a1YzfJWfWO1yS1cUS005VMPslht8/OjJpqqXLa3b6iGgdvjOr/swqCa2dH5QzUsfh2huh2qxQH3sFB9mdH2cFBtS0I1I6tDQ3URVpvdQDVYANEzTpswpL5MnL6UkNqEWf8Rp1txWjRZMyhIm9oI0vtAmg3r9uqQzisl5UAa6Wpn8TRC3OP0DdJw2uoiQZoZ0fmDNMx7/Mv7jTSQv7z92svucfKAMcw9nd/CsqNWLh2w1N9qQrVhCub/RGs9e1tkZkEXy3xrLgcp76rs8K5//Lu2NLjsbcfa4GSBl0/rVMLdggW6M08B46Y4whxvTbtKv9rVHS6VIKhTQYMqFzri1wAGPO0llwJA3DH194SzDC8LZ1k5eLwK4rKFKsLeq7HjrJO5fvUY13/gYorjXb8l6/qlU2l07F0pU1tnj8HQpdx7Y3Bg8KPalJvIINp3X2un0cna3ffhk3r0PrsCDRMk7hrnk43yCbir2ePg2SwNeolALOgQY/y+jW5raN/rdtY2BWnUzU49vdExltRqdHTIXXVkYXyiTXFk1xHDvnQ+NQ77ag3l+7cRG6ZSPkETkS0ud9NV+c6zWbjJx3Bbhf/9fQsrB3ZSXdFJmNBUtwNjQFERg9ZXlGALsizrdAHUcX7R+lGLNw6oO1b4eA5W+Az6Rggbctu2unCYeXnHdeEOqko2TlcabkP3PzKqo6GS17Ah0vCgpMqGtZQjqTopqWIW192k1+CkKl/tKlkg9lZeVbuVLLWCl5yOXdmwWOdjsStV0vqa7Eo5A+uTXZXeM7syDNkCn1Z2RVpAX+fArmBB0YdlVxqbnzgZvRIUG430aiKgV7YpKgUdlGE5MHhvY1hwUfz7ZVjiBfMDEyxHtJhpJFidECykOkC9g7IrB4Z3I7s6Kbti5tZdtcqWXZm61WRXejchH/9eIqTzaNQVuVJ33WkntzJarjgdtXJg2DhSq1NSK3nTsyVNr1tqBWJtywF1Y9LUCnF2ocO+zoBaOTCa/7DUCtkwFhyUWjlwNcBIrUTUSvS2+UF5lcpeULSvJvjowut90uyx8JrPBumit7aIomenPzFrEJ6IX032MdgPw1510z6D+UFVEVUoj6gFUUsXVMcOjFoS7005fiXDPnG+7eUOQ3HDCtZbI9QKmc7kXYH8FBfigxTZ9Q4WxwArOBnsFyA+cLLf0JsRqS6ouB40IlWVvpcWM3feVex0JovWdBM65kEXranKuLS4Betl1xbLY33HIK47ztQ8dDqW4bjN96VJ5gw7Q3LBj/lc5NK1ypROuHaN7G5/UrjU0PaHmdGnfwA=</diagram></mxfile>
<mxfile host="app.diagrams.net" modified="2023-06-19T20:32:30.800Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36" etag="Gs4bWszSsx3SiBIGjquS" version="20.5.1" type="device"><diagram id="na7FgJPB3qsZf9mXNy79" name="Page-1">7V1tc6M4Ev41rrr9EBcgwPAxL5Ns1c3dZme29mY+EizbzGDkA5w48+tXgIRBLRvZAWwneKomIEDgbvXT/bRaeIRul5uH2Fst/kOmOBwZ2nQzQncjw9B1x6Z/spbXosV2zKJhHgdTdtK24WvwC7NGjbWugylOaiemhIRpsKo3+iSKsJ/W2rw4Ji/102YkrN915c0xaPjqeyFs/V8wTRdFq2Np2/bfcTBf8DvrGjuy9PjJrCFZeFPyUmlCn0boNiYkLbaWm1scZsLjcimuu99xtHywGEepygV/Ot9+/PEt+nvx43m2+fXwZfEz/X5lsW6evXDNvvFNML9Nnr9gb4pj9uTpKxcH/RKrbHO9DO9jb0k3b14WQYq/rjw/a3+ho4C2LdJlSPd0ujn1kgWesp0ZiVKmY93MrmUi1emz06u4IFG2NwvC8JaEJM7vjO7zD22H35t/BxyneFNpYnJ4wGSJ0/iVnsKOcpWwMcmV+bJV8MRhbYuKci0+KD02qOZlz1u50w0m+kPUYAE1XK9WYeB7aUCid6oFR2tUQwkD/agBGsPIsMM0k1/wXFOB/f91Zrc3ta159pdCDr33gplPcTVtyDsozjiyz6Groauhq5N1xa94isUWsWcBrBuQmWP5cjPPIqjxk5cE/nhK/PUyh9YbLwzmET3Bp7sUUtBN6D3h8JEkQe4bKgcy6KUuI/wsnLAMptPsacoTrlmX5QEB5WezmeH72cOlMfmJK0em9pNt2e3g/6QO/7oJ8b9sq+K/0Rn+TyT4L2gzDCJcCoZHhLogwojkJ3HNhXiW7pN+QgdHEM3/InRo3F3p25bP+YV3aNvyhfvmfHCnXuo95U+mScZFXJx7syJBlOaism5G1l3eEqe3JKJfwgty9WEvSV9wkoqRQgtaNic1LSNdh1qWBVtORzp2gI6p5vDYT56Bruk3TAVjrdkDU/MezQPTBLqXoUNM1tE014EmidbyffaQRgsK0jVBQzwYrWjIligIdWWF9gRoAk8pKWO7dOwuyJxEXvhp2yrIbHvOZ5JbVaa8HzhNX5kkvXVK6qqlAoxfv7Hr853v2c7Y4rt3m+rBu1eJegy2XwFMLf/sM6WErGMf74tKmaBTL57jdB98sR4zae3VfIxDSiye6xRXpkd26WOGIBWTdgTkts2xI9hr8bDsQmFAlE/yhkhdA2b8QOgYSDN4NjQdjqAwDFaJAlMC5lZ3jVMLO1NT5hod4wnZLblGZIu+0S4lXDFLV2aWqCtyJGNHopSj6XWWdMmgL/QSGsnUhduMbDLTwZsgLS2TblcMk+5t7TLb4Wa5NeexfphB53uPOA6o2DLsvuvJes0TWa9hOeOJVe+l+FbAekFfyBKRwAJ9dY0EBhiWWe5K6kLywDgbhjgJflUip8oIVffcqqMXDp295sUSmezpRmWuSgE8lAfG2wIo91LdM+rEPRuqBs4z3+0Z+Jv0aOx3oRJwv2wX6qISmk7lQo3BhXZqYZxr9u1CkYZac6GUj/XtQvkovHQXalyKCy0nCfv1oT34QqRoqa51Xr4Q7fWF/31nvhDpk/GpXaF9kCtkebVaclKY3nz0Ugo3Ud5iaAgkaY2jnedODTSO3op0LdlMJmt7qzvSRHckpOZUfZHYkWEKHXXtiFzoiLIKDjD1crvwKMSH0kxPXceHmmd9CBQfyQSJlf2Tma2df9oxWxvMkMAU0ERqtGNr9x
h6k9kiBbPlM1r+Og5fb2LP/5m5giZNbNWW64XC6+9qOmo20wMKQwxxtgLCpCGTeFezFaYCY3hH8pbNAcrkbdogRm5P5DAWACIfZgEPVLRIciZQz73OAprmoOSup3qpA5NEmf2qGdbVDWpuWc26g06sZIWyjffkIx1XNhcodZMwldSe1GGsDqQ+5DKPz5DwSLvrXCYSSAYdSGPjyFymYTn1vnR3bIuddcwhLTitcZHJTG5fZ5/MtGAo9RgTHyc13i6pscwEdJXkErvOmLyx2oxklZdFCmAny9cPY/n7anQ6LHZ3NauO4qYpm46S+U0xCdOe5g7LwZ0LhB8I4MdjNR/ZjVjN81ZdY7UtLFzRJ2gsph5UsdoR50dtNNaFbG/XUA0Dt8d1dtmFQTW3o/OHalj8OkB1M1TLA+5+ofoyo+3+oNpRhGpOVvuG6jysttuBarAAomOctmFIfZk4fSkhtQ2z/gNON+K0bLKmV5C2jQGk94E0H9bN1SGtV0qqgTQy9dbiaYSEx+kapOG01UWCNDei8wdpmPf4l/8bbaD/jCv6H1WIdu2n9yR+IAQmoM5vddlRy5cOWO9v1i3EsiWTgDwn0stKswn0s9zBZnJQcrHaDhf7x78r64OL3nYsEI4XZPm0ThR8LlilO/M1MG7yI9z7VrSrdatd0xXyCZJiFSTxxt0pF3rjVwyjnua6Swkq7pj/eyJpSpa5xyy9PFnhqGhhinD2auw46+T+Xz/G/x+4ouJ4/z9R9f/K+TQ29q60sWPyx+DoUuy9MUKwxFFtq81mUO17r5XT2Izt7vuImT12n13Rhg2yd7Xz6UbxBMLV/HHIbJbgTsKQCfSKEXnfRrc1tO9VO2uah7SqZqef3ug4VWo0OjbkrlqyMDHbprmqi4lhX6aYH4d9Ncbz3duIA/Mpn6CJqFaYe8mqePHZLNhkY7ipzP/+voGaAzspr2glTKir24UxoKySwegqSnAkqZZ1sgDqOL9o/agVHAcUH2tiPAfLfHp9LYQDCW5TcThMv7zj4nAXlXUbp6sPd6D7HxjV0VApatiSabhXUuXAgsqBVJ2UVHGLa2/mq3dSlS15VawSeyuvqtxKlVrBS07HrhxYsfOx2JWuaH11dqWdgfWpLk3vmF1ZlmqVTyO7oi2gr3NgV7Cq6MOyK4NPUpyMXkkqjgZ6NZLQK8eW1YP2yrBcGLw3MSy4Mv79Miz5qvmeCZYrW9E0EKxWCBbSXaDeXtmVC8O7gV2dlF1xc2uvZGXLrmxzUmdXZjshn/hyImSKaNQWudJ33Wknt7IarjgdtXJh2DhQq1NSK3XTcxRNr11qBWLtiQuKx5SpFRLswoR9nQG1cmE0/2GpFXJgLNgrtXLhkoCBWsmoleyV873yKp2/pWhfYfDR1df7pNlh9bWYDTJlr26RRc9ud2I2IDxRvxrvY7Afhr2atnMG84O6JitTHlALopYpqY7tGbUUXp5y/HKGfeJ82xse+uKGJaw3RqglMp3JCwPFKS4kBimqix4mAgMs4aS3n4H4wMl+y6xHpKak4rrXiFTXul5fzN15W7HTmaxcM23omHtduaZrw/riBqxXXWCsjvUtg7jpumP70OlYjuOO2JehmDNsDcklv+hzkevXSlM64QI2urv9XeFCQ9tfZ0af/gE=</diagram></mxfile>
2 changes: 1 addition & 1 deletion docs/how-it-works.svg
4 changes: 2 additions & 2 deletions scripts/pprof.sh
@@ -6,13 +6,13 @@

SCRIPT_PATH=$(dirname "$(readlink -f "$0")")

-$("${SCRIPT_PATH}/killweb.sh")
+"${SCRIPT_PATH}/killweb.sh"

profilesFor=(bigcsvreader gocsvreadall gocsvreadonebyone)
port=8084
for profileFor in "${profilesFor[@]}"
do
-echo "Handling profiles for ${profileFor}"
+echo "Handling profiles for ${profileFor}"
go run "${SCRIPT_PATH}/../cmd/pprof/main.go" -for="${profileFor}"
go tool pprof -http=":${port}" "mem_${profileFor}.prof" &
port=$(( port + 1 ))