Add support for csv files as input#152
Conversation
9e17162 to
76c6a81
Compare
src/virtualship/utils.py
Outdated
| def load_coordinates(file_path): | ||
| """Loads the coordinates from a file.""" | ||
| try: | ||
| return pd.read_excel(file_path) # Try reading as Excel | ||
| except ValueError: | ||
| pass | ||
|
|
||
| try: | ||
| return pd.read_csv(file_path) # Try reading as CSV with commas | ||
| except ValueError: | ||
| pass | ||
|
|
||
| df = pd.read_csv(file_path, delimiter=";") # Try reading CSV with colons | ||
| if len(df.columns) == 1: | ||
| return None | ||
| return df |
There was a problem hiding this comment.
Since the file_path will contain the extension, isn't is better to just infer the file type by the extension instead of try and error?
There was a problem hiding this comment.
Indeed, this will now be a problem when the read_excel fails for another reason than the file not being an xls-file (e.g. because openpyxl is not installed). So better to choose between read_excel and read_csv based on extension indeed
There was a problem hiding this comment.
The idea was that sometimes a file can have a missing extension while still being valid, but I see your points 🙌
I have changed the implementation to check based on the file extension, and also improved the error handling.
|
Thanks for contributing! |
erikvansebille
left a comment
There was a problem hiding this comment.
Thanks for this PR, very good work on the unit tests in particular! But I think the logic to decide between open_excel and open_csv should be made more robust
src/virtualship/utils.py
Outdated
| def load_coordinates(file_path): | ||
| """Loads the coordinates from a file.""" | ||
| try: | ||
| return pd.read_excel(file_path) # Try reading as Excel | ||
| except ValueError: | ||
| pass | ||
|
|
||
| try: | ||
| return pd.read_csv(file_path) # Try reading as CSV with commas | ||
| except ValueError: | ||
| pass | ||
|
|
||
| df = pd.read_csv(file_path, delimiter=";") # Try reading CSV with colons | ||
| if len(df.columns) == 1: | ||
| return None | ||
| return df |
There was a problem hiding this comment.
Indeed, this will now be a problem when the read_excel fails for another reason than the file not being an xls-file (e.g. because openpyxl is not installed). So better to choose between read_excel and read_csv based on extension indeed
9f88573 to
720036e
Compare
15044d9 to
5e65979
Compare
for more information, see https://pre-commit.ci
|
Hi @erikvansebille, I made some changes as requested—thanks again for your feedback! Let me know if you have any further comments on this updated version. |
|
Looks good. Thanks for the contribution @diogomatoschaves ! |
Uh oh!
There was an error while loading. Please reload this page.