@FreddieChopin
Contributor

alloc_cluster() ignores read errors: it simply tries the next cluster until it either finds an empty one or has checked all of them. A large volume can have a lot of clusters, e.g. a 16 GB SD card with a standard format has about 2 million. When a "persistent" read error occurs during alloc_cluster() (after a successful mount), for example because the volume is physically disconnected at a very inconvenient moment, or because the volume is or becomes damaged and all further reads fail, this loop becomes practically infinite. In one application we found a damaged SD card for which reads of the first 600-700 blocks work perfectly fine, but any read beyond that makes the SDIO interface time out (the card does not switch to the expected state within the specified time). As the timeout for the operation is ~100 ms, the function would loop for over 2 days. The same card also fails to work in a PC, where any read beyond the first ~350 kB (about 700 blocks) fails with an I/O error.
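As a rough check of the "over 2 days" figure, the worst case can be estimated by assuming that every cluster lookup triggers one read that runs into the full timeout. The helper below is purely illustrative (its name is hypothetical and not part of the library); it only multiplies the two numbers quoted above.

```c
#include <stdint.h>

/*
 * Illustrative estimate only: assumes one timed-out read per cluster,
 * which is the worst case described above, not a measured value.
 */
static uint64_t worst_case_scan_seconds(uint64_t clusters, uint64_t timeout_ms)
{
    return clusters * timeout_ms / 1000u;
}
```

With about 2 million clusters and a ~100 ms timeout this gives 200000 seconds, i.e. a bit more than 55 hours.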

Fix this by returning from alloc_cluster() with an error when any read operation fails.
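The idea can be sketched as follows. This is a minimal, self-contained illustration and not the actual patch: the error codes, the `read_entry_fn` callback, `alloc_cluster_sketch()` and the `fake_volume` helper are all hypothetical names made up for this example, and the real function works on the library's own structures.

```c
#include <stdint.h>

/* Hypothetical error codes, used only in this sketch. */
#define SKETCH_ERR_IO        (-1)
#define SKETCH_ERR_NO_SPACE  (-2)

/* Hypothetical callback: reads the FAT entry of cluster `index`.
 * Returns 0 on success or a negative error code. */
typedef int (*read_entry_fn)(void *ctx, uint32_t index, uint32_t *entry);

/* Scan `count` clusters starting at `first` and return the first free one
 * in `*out`. Any read error ends the scan immediately. */
static int alloc_cluster_sketch(void *ctx, read_entry_fn read_entry,
                                uint32_t first, uint32_t count,
                                uint32_t *out)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t index = first + i;
        uint32_t entry;
        int err = read_entry(ctx, index, &entry);

        if (err < 0)
            return err;          /* fail fast instead of trying the next cluster */

        if (entry == 0) {        /* 0 marks a free cluster in this sketch */
            *out = index;
            return 0;
        }
    }

    return SKETCH_ERR_NO_SPACE;  /* every cluster was read and is in use */
}

/* Fake volume for experiments: clusters below `readable` are readable and
 * in use, every read at or beyond it fails. */
struct fake_volume {
    uint32_t readable;
    uint32_t reads;
};

static int fake_read_entry(void *ctx, uint32_t index, uint32_t *entry)
{
    struct fake_volume *v = ctx;

    v->reads++;
    if (index >= v->readable)
        return SKETCH_ERR_IO;

    *entry = 1;                  /* non-zero: cluster is in use */
    return 0;
}
```

With a fake volume whose reads fail from cluster 700 onwards, a scan over 2 million clusters starting at cluster 2 stops at the first failed read and returns the I/O error, rather than retrying every remaining cluster.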

Fixes #15
