Added support for Structured Arrays #211
aliosmankaya wants to merge 11 commits into skops-dev:main from
|
Thanks for the PR @aliosmankaya. Could you please also add tests for the cases you're adding support for? We should also make sure that when the user uses these types, the inference side of things works. That means we should test it as well, and figure out how to pass the data to the model on the api-inference-community side. |
|
@adrinjalali can you review the tests? |
|
Those are nice tests, thanks @aliosmankaya. What I'm wondering is what would happen if we train a scikit-learn model, persist it, push it to the hub, and try to get inference from that model using |
|
@adrinjalali do you mean you want to use If so, I reviewed the function and added support so that a structured array bug does not come up in the future. |
|
Here's an end to end test:
Note that on the hub side, things are handled by this repo: https://github.com/huggingface/api-inference-community/tree/main/docker_images/sklearn , so we might need to fix things on that side as well. |
… to get_model_output for structured array
|
Hi @adrinjalali, sorry for the late response. To test the hub side, I pushed a Linear Regression model trained on the iris dataset converted to a structured array.
Can you review the commit? |
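For readers who want to reproduce the data preparation step, here is a minimal sketch of converting a regular 2-D array into a structured array. It is not the code from the commit: the field names and the two sample rows are made up for illustration, and only the conversion itself is shown, not the model training.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative values only (two rows, two features); not the actual iris data.
X = np.array([[5.1, 3.5], [4.9, 3.0]])

# Hypothetical field names chosen for this sketch.
dtype = np.dtype([("sepal_length", "<f8"), ("sepal_width", "<f8")])

# Each row of X becomes one record; the last axis is mapped onto the fields.
X_structured = rfn.unstructured_to_structured(X, dtype=dtype)
```

After the conversion, `X_structured` is 1-D with one record per row, and each column is reachable by its field name, e.g. `X_structured["sepal_length"]`.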
| col: (data[:, idx] if len(data.shape) > 1 else data[idx]) | ||
| for idx, col in enumerate(_get_column_names(data)) | ||
| } | ||
| if data.dtype.names: |
With that condition, we can tell whether the data is a structured array or a regular numpy array.
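A small sketch of what that condition relies on: `dtype.names` is `None` for a regular numpy array and a tuple of field names for a structured one. The helper name `is_structured` is hypothetical and only used here for illustration.

```python
import numpy as np


def is_structured(data: np.ndarray) -> bool:
    """Return True when the array has named fields (a structured dtype)."""
    # dtype.names is None for regular arrays and a tuple of names otherwise.
    return data.dtype.names is not None


regular = np.array([[1, 2], [3, 4]])
structured = np.array([(6.1,), (5.5,)], dtype=[("col1", "<f8")])
```

The explicit `is not None` comparison is a stylistic choice in this sketch; the truthiness check used in the diff gives the same result for these two arrays.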
| if data.dtype.names: | ||
| inputs = { | ||
| col: ( | ||
| [(x[0] if len(data.shape) > 1 else x) for x in data[idx].tolist()] |
To convert the structured array data to a list by iteration, we check the data shape to decide how to index each value.
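As a point of comparison, and not the code in this diff, a structured array can also be turned into per-column lists by accessing each field by name, which avoids the shape check. The helper name `structured_to_columns` is hypothetical, and the sketch assumes a 1-D structured array.

```python
import numpy as np


def structured_to_columns(data: np.ndarray) -> dict:
    """Map each field name of a 1-D structured array to a plain Python list."""
    # Indexing with a field name returns that column as a regular array.
    return {name: data[name].tolist() for name in data.dtype.names}


data = np.array(
    [(6.1, 1), (5.5, 2)],
    dtype=[("col1", "<f8"), ("col2", "<i8")],
)
columns = structured_to_columns(data)
```

Note that indexing a structured array with an integer, as in `data[idx]`, selects a record (a row) rather than a field, so the two approaches are not interchangeable.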
|
@aliosmankaya the error is likely due to how the inference in |
|
@merveenoyan sure, I'll look. |
|
Hey @adrinjalali @merveenoyan, to expand the test, I added new models to my account, so I tested both structured and regular numpy arrays. On the test side, I used InferenceApi from
# Test 1: model data = np.array([1, 2])
# inputs condition with the changes...
inputs = {"data": {"x0": [1, 2]}}
InferenceApi(repo_id="aliosmankaya/reg_arr_model_1_dim")(inputs)
# Test 2: model data = np.array([[1, 2], [3, 4]])
# inputs condition with the changes...
inputs = {"data": {"x0": [1, 3], "x1": [2, 4]}}
InferenceApi(repo_id="aliosmankaya/reg_arr_model_2_dim")(inputs)
# Test 3: model data = np.array([(6.1,), (5.5,)], dtype=[("col1", "<f8")])
# inputs condition with the changes...
inputs = {"data": {"x0": [6.1, 5.5]}}
InferenceApi(repo_id="aliosmankaya/structured_arr_model_1_dim")(inputs)
So the point is, we prepare |
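The three payloads in the comment above can be reproduced locally, without calling the hub, with a sketch like the following. `build_inputs` is a hypothetical helper written for this illustration and is not the function changed in this PR; it assumes positional `x0`, `x1`, ... column names as in the quoted tests, and a 1-D structured array in the structured case.

```python
import numpy as np


def build_inputs(data: np.ndarray) -> dict:
    """Hypothetical helper: build a {"data": {column: values}} payload."""
    if data.dtype.names is not None:
        # Structured array: one column per named field.
        columns = [data[name] for name in data.dtype.names]
    elif data.ndim > 1:
        # Regular 2-D array: one column per position on the second axis.
        columns = [data[:, idx] for idx in range(data.shape[1])]
    else:
        # Regular 1-D array: treated as a single column, as in Test 1.
        columns = [data]
    # Positional names (x0, x1, ...) are an assumption taken from the tests.
    return {"data": {f"x{idx}": col.tolist() for idx, col in enumerate(columns)}}


test1 = build_inputs(np.array([1, 2]))
test2 = build_inputs(np.array([[1, 2], [3, 4]]))
test3 = build_inputs(np.array([(6.1,), (5.5,)], dtype=[("col1", "<f8")]))
```

Because the sketch names columns by position, the structured array's own field name (`col1`) does not appear in the payload, just as in Test 3.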
|
@aliosmankaya as we discussed on Twitter, it looks good to me. I think it's good that you make the necessary changes in |
|
Thanks for the work @aliosmankaya. A few things before I can review this: The feature names you're using are the same as the autogenerated ones on the inference side. Could you please:
Also, for reproducibility, could you please include the code generating the repo / model in the repo you create? |
|
Hey @adrinjalali, I just updated the models and also uploaded their code. Can you review it again? Also, the models work in the widget as well. |
|
@aliosmankaya sorry but I can't find the changes you're referring to. The links you have in this comment (#211 (comment)) still have models which have |
|
Note that the issue might be that sklearn is not detecting feature names from a structured array (that might very well be the case), in which case we can't support them either. |

Added support for Structured Arrays to the _get_column_names function. So, with these updates, data could be: