Skip to content

Conversation

@you-n-g
Copy link
Collaborator

@you-n-g you-n-g commented Nov 17, 2022

Description

Motivation and Context

How Has This Been Tested?

  • Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.
  • If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

  1. Pipeline test:
  2. Your own tests:

Types of changes

  • Fix bugs
  • Add new feature
  • Update documentation

@you-n-g
Copy link
Collaborator Author

you-n-g commented Nov 17, 2022

@ChiahungTai You are welcome to review this PR.

"""
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
(The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChiahungTai You can start from these docs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"we can have multiple Experiments with different uri" => IIUC, the words are imply the design of ExpManger is suggestion the user use different uri in each experiment?

From my understanding in MLFLow, the uri is like a store backend(sql, file...). The ExpManger is a single entry to lookup and manipulate the experiments in the uri.

So the user can change different uri to retrieve the different topic of experiements.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
ExpManager is a singleton, but we can change its uri to get experiments from different uri.
I added some comments just now.
Please check if it is helpful

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your new comment is more clear than before.

@ChiahungTai
Copy link
Collaborator

Fail the test in test_all_pipeline.py.
I have add the test case for you in #1363.

@ChiahungTai
Copy link
Collaborator

Looks like the exception is the same as the test case #1363.

        except MlflowException as e:
            raise ValueError(
                "No valid experiment has been found, please make sure the input experiment name is correct."
            ) from e

"""
prev_uri = self.exp_manager.default_uri
C.exp_manager["kwargs"]["uri"] = uri
self.exp_manager.default_uri = uri
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the set of default_uri is not engouh. The self.exp_manager is not change the client uri if you only change default uri.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fixed now.
The _client will not be cached. No further maintenance is required for it.

self.active_experiment = None
# When an experiment end, we will release the current uri.
self._current_uri = None
self._set_client_uri()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pump into an exception - No argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now.

# Delay the creation of mlflow client in case of creating `mlruns` folder when importing qlib
if self._client is None:
self._client = mlflow.tracking.MlflowClient(tracking_uri=self.uri)
return self._client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might need to sync the returned self._client to the updated default_uri.
Or you have set the uri in every function in access the client.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed Now.

# set the tracking uri
mlflow.set_tracking_uri(self.uri)
# start the run
# start the RuntimeError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the start the RuntimeError mean?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now

_ = mlflow.tracking.MlflowClient(tracking_uri=str(self.TMP_PATH))
end = time.time()
elasped = end - start
self.assertGreater(1e-2, elasped) # it can be done in less than 10ms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertGreater or assertLesser

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the direction. And it is more readable than before

"""
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
(The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your new comment is more clear than before.

@you-n-g
Copy link
Collaborator Author

you-n-g commented Nov 18, 2022

@ChiahungTai
Hi, I have fixed all the comments and waiting for the CI.

I checked out your test

I didn't get what you want to assert in the test.

It seems that the exception looks reasonable.
When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given.
So the default exp name is used.
There is no experiment which is active or is named with the default exp name.
So an exception of No valid experiment has been found is raised

@you-n-g you-n-g requested a review from ChiahungTai November 18, 2022 03:44
@you-n-g you-n-g mentioned this pull request Nov 18, 2022
5 tasks
@you-n-g you-n-g changed the title Optimize the implementation of uri Optimize the implementation of uri & Fix async bug Nov 18, 2022
@you-n-g you-n-g changed the title Optimize the implementation of uri & Fix async bug Optimize the implementation of uri & Fix async log bug Nov 18, 2022
@ChiahungTai
Copy link
Collaborator

The test code is pass after assign the experiment_name and recorder_name.
But in my own project, I provide the experiment_name and recorder_name.
But the code is too long to write a simple test case.
with R.uri_context(uri=uri_path): recorder = R.get_recorder(experiment_name=EXP_NAME, recorder_name="original")
I am working on another version of cached_pipeline in my own project. I might test the new commit and try it my local first.
Also I will upload a new test if I can rewrite a simple test case.

@ChiahungTai Hi, I have fixed all the comments and waiting for the CI.

I checked out your test

I didn't get what you want to assert in the test.

It seems that the exception looks reasonable. When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given. So the default exp name is used. There is no experiment which is active or is named with the default exp name. So an exception of No valid experiment has been found is raised

@ChiahungTai
Copy link
Collaborator

Your new patch works fine in my own project.

The test code is pass after assign the experiment_name and recorder_name. But in my own project, I provide the experiment_name and recorder_name. But the code is too long to write a simple test case. with R.uri_context(uri=uri_path): recorder = R.get_recorder(experiment_name=EXP_NAME, recorder_name="original") I am working on another version of cached_pipeline in my own project. I might test the new commit and try it my local first. Also I will upload a new test if I can rewrite a simple test case.

@ChiahungTai Hi, I have fixed all the comments and waiting for the CI.
I checked out your test
I didn't get what you want to assert in the test.
It seems that the exception looks reasonable. When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given. So the default exp name is used. There is no experiment which is active or is named with the default exp name. So an exception of No valid experiment has been found is raised

@ChiahungTai
Copy link
Collaborator

The patch LGTM. You can merge it after fix the pylint error.

@you-n-g you-n-g merged commit 994f893 into microsoft:main Nov 18, 2022
@you-n-g you-n-g deleted the r_optm branch November 18, 2022 05:11
@you-n-g you-n-g added the enhancement New feature or request label Dec 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants