Queue cleanup by alexcjohnson · Pull Request #159 · microsoft/Qcodes

alexcjohnson · 2016-05-09T23:02:51Z

Taking care of the last 2 points in #136 to make a more robust server architecture:

there's a new BaseServer class that all the servers inherit from, so we have a unified, well-tested pair of pathways for ask and write queries.
query handlers are no longer responsible for replying to the query; instead, the caller can invoke any handler with the ServerManager methods ask or write, and these methods encode into the query which flavor it is, invoking the appropriate processor on the other end: ask will reply with the handler's return value, write puts nothing on the queue.
there is no error_queue anymore, the response (to ask only - write never responds) just indicates whether it is valid or an error and is handled accordingly.
write queries report their errors locally (using logging), rather than to the server's response queue. While this might make these errors less visible because they will never bubble up to the main process, it means these errors get reported faster (you will see them in the subprocess monitor right away) and it relieves potential confusion in that the next query after the error occurs will be the one to throw the error, even though that next query is likely not a problem itself.

This is all kind of low level so I'm not sure who's interested in looking through it ( @guenp ? @spauka ? @AdriaanRol ? @damazter ?) , but it makes me a lot happier anyway about deleting the timeout in #144

damazter · 2016-05-09T23:44:05Z

qcodes/utils/multiprocessing.py

+            try:
+                # the only way you'll get a response without asking for one
+                # is if we don't understand the query type code
+                self._response_queue.put((RESPONSE_ERROR, error_str))


I think it is dangerous (or at least annoying) to put something in the return queue if the user doesn't expect it. Granted, if the user uses QUERY_WRITE then there is no problem, but I see mysef making the typo WIRTE [sic] which would lead to quite annoying situations.

I also see that there is no solution to this as the opposite solution will cause problems when the user types AKS [sic].

When I write my servers in python I use the convention that every command returns something. In the case of a write command, the return could just be OK.
Then I can envision problems with multiprocessing as you don't want to wait for your write to finish before continuing to the next command, but I guess that there is always a moment that you want to make sure that the WRITE command executed well, and that would be the time to check what is in your return_queue.

bottom line: I don't like the Write command a lot. but I would like to see the error handeling make an explicit case for a code_value error:

if code == QUERY_WRITE: logging.error(error_str) elif code == QUERY_ASK: self._response_queue.put((RESPONSE_ERROR, error_str)) else: logging.error(error_str) self._response_queue.put((RESPONSE_ERROR, error_str))

thereby You are posting the error double, but you are sure the user will actually see the error some way or the other

That's a good idea, double-posting if the mode is unclear. Granted, these codes are constants, not strings - well, the constants are strings... but if you use them as constants, typos will appear while you're coding rather than at run time.

You're right to be wary of write - there's really only one place I strongly prefer it, and that's posting data from a Loop to the DataServer. In that case I really don't even want to wait for confirmation that the write has happened, because the DataServer could be doing other long-running things like writing a data file, or syncing to a live plot, and we don't want that to slow down the Loop. But if we want to have other commands happen in a non-blocking way we could certainly make a third mode, QUERY_CONFIRMED_WRITE or something, that would acknowledge receipt after it parsed the basic structure of the query but before acting on it.

damazter · 2016-05-10T00:21:39Z

Reading this code, I get quite confused. (this is not specific to this pull request, but If I am to look at this code, I need a bit better understanding)

While reading the code I get confused which thread is containing which classes.
I got as far as that ServerManager creates a new thread, which will behave comensurate to server_class, I think this serverclass is intended to be a child of BaseServer.
would it be i good idea to put this intention explicitely in the docstring of ServerManager?
I get that this intention is not a necessity (it shouldn't be), but it would make it more clear to get a quick understanding of the grammar while reading the code. That the reader can look up BaseServer, To understand what kind of class is intended here.

similarly the QcodesProcess object is not very intuitive for readers not familiar with this muliprocessing package.
Would it be possible to put a few helpful comments in the docstring which explains that this object represents a runnable thread, and explain what the target kwarg does (although it is part of the parent, mp.Process)

From the code it very hard to see now, that QcodesProcess will behave according to the server_class argument passed in ServerManager.
In the QcodesProcess code it is completely invisable how behavior is passed to this class.
A simple line stating that QcodesProcess will "run", using the command passed to target which is fetched from the server_class variable.

tl;dr
Thinking about it, I think the solution should be, which I would think is a lot more clear, is passing the server_class to QcodesProcess instead of target. Then, looking at QCodesProcess it is immediately clear that this is some sort of server.

damazter · 2016-05-10T01:16:35Z

Now for some annoying suggestions (sorry :( )
Shouldn't the class InstrumentManager be called InstrumentServerManager
the managing refers to a server of the InstrumentServer kind, not to the managing of instruments and definitely not to managing a single instrument.

I don't find the name InstrumentConnection intuitive. It seems to serve the exact same purpouse as the RemoteInstrument classs. Shouldn't these two classes be merged? This merging isn't trivial, but it feels like everything that is part of InstrumentConnection should be part of RemoteInstrument

This would make the hierargy of the classes a bit more clear. An InstrumentServerManager contains an InstrumentServer (which will contain all real instrument drivers) and RemoteInstruments which represent the local mirror version of the real drivers. Hence there is nice symmetry between managers containing remoteinstruments and servers containing real instruments,
managers are responsible for talking to servers and remoteinstruments are responsible for talking to real instruments (via the manager and the server)

alexcjohnson · 2016-05-10T07:56:49Z

Shouldn't the class InstrumentManager be called InstrumentServerManager

Yes, that makes sense.

it feels like everything that is part of InstrumentConnection should be part of RemoteInstrument

Yep, that would be much cleaner too, I will try to do this. It only exists because I was originally using the instrument class itself as the remote, but as the remote is now a separate dedicated class it is unnecessary and confusing.

damazter · 2016-05-10T17:47:59Z

@alexcjohnson , Then I will wait with continuing looking at this code until you make these changes, and then go through it again.

I think QUERY_CONFIRMED_WRITE is not really a priority at this point as a third mode, as if you want confirmation, you should just use ask.
On the other hand, it might be a good idea though to change WRITE such that it behaves as QUERY_CONFIRMED_WRITE, then you are always garantueed to get a reply back from the server after every command, unless stuff is really going wrong, which will then also be indicated by not getting a response from the server.

abort button is quiet, qc.halt_bg() is loud

alexcjohnson · 2016-05-18T07:52:17Z

qcodes/loops.py

+        forcibly terminating
+
+    traceback: (default True) whether to print a traceback at the point
+        of interrupt, for debugging purposes


@MerlinSmiles re #178 - plain qc.halt_bg(), if you enter this in a cell, will only wait max 5 sec, and try to print a traceback (into the subprocess widget, if you're in a notebook). But the abort button in the subprocess widget (here) will wait longer (for whatever get() or set() is happening to finish and leave the system in a clean state) and not print any traceback.

MerlinSmiles · 2016-05-18T08:26:18Z

@alexcjohnson
Great! The last changes seem to work just fine! Nice to have the option with and without traceback.

damazter · 2016-05-18T22:51:29Z

qcodes/instrument/remote.py

-        manager.connect(self, instrument_class, args, kwargs)
+        self._server_name = server_name
+        self._shared_kwargs = shared_kwargs
+        self._manager = get_instrument_server(self._server_name,


@alexcjohnson looking at this line of code,

self._manager = get_instrument_server(self._server_name,

is this really the manager?
then get_instrument_server maybe be ``get_instrument_server_manager ?

yes, that was a sloppy name :)

alexcjohnson · 2016-05-19T09:55:18Z

@damazter I think I've addressed all your comments, want to take another look? Since I had already made so many changes to multiprocessing.py I took the opportunity per #181 to rename and break it up. Instead of qcodes.utils.multiprocessing it now lives in qcodes.process.* - I figured this stuff had graduated beyond just being utils :)

peendebak · 2016-05-19T18:10:07Z

@alexcjohnson

While this might make these errors less visible because they will never bubble up to the main process, it means these errors get reported faster (you will see them in the subprocess monitor right away) and ...

I have not read the entire PR, but is the subprocess monitor the qc.show_subprocess_widget()? In that case anyone not using the notebook will not see the error messages.

alexcjohnson · 2016-05-19T19:57:07Z

@peendebak

I have not read the entire PR, but is the subprocess monitor the qc.show_subprocess_widget()? In that case anyone not using the notebook will not see the error messages.

There are two safeguards against this (which are not new in this PR, they've been there since I made the StreamQueue):

If in_notebook() is False the subprocesses will not connect to this anyway. We should make sure this test is as accurate as possible - refining it ala your spyder test in make plot updates work outside ipython notebook #155
If the subprocess does connect, then prints something but the queue hasn't been read recently (3 seconds by default) it will also print to the default stdout or stderr (sys.__stdout__ and sys.__stderr__). Though in some of these situations the initial stdout is different from sys.__stdout__. In a notebook for example, sys.stdout is an ipykernel stream that gets picked up by the notebook, but sys.__stdout__ prints to the notebook server terminal console. If in this case we just want to make sure the user sees the message no matter what, we might be better off printing to both streams if they're different...

MerlinSmiles · 2016-05-22T16:36:58Z

I can still mess the halt things up.
When I start a loop, hit the stop button in the notebook, and then hit the abort button in the widget, the notebook freezes. Once I got this error in the terminal:

Traceback (most recent call last):
  File "c:\users\qdev\anaconda3\lib\site-packages\ipykernel\eventloops.py", line 25, in pr
ocess_stream_events
    def process_stream_events():
KeyboardInterrupt

But that was only once out of 10 times.

alexcjohnson · 2016-05-24T12:14:56Z

@MerlinSmiles

When I start a loop, hit the stop button in the notebook, and then hit the abort button in the widget, the notebook freezes.

Interesting... I can also get some horrible things to happen by pressing the stop button while pyqtgraph is open (even if no plots are open but pyqtgraph has been called in the past). Perhaps we could find a way to have pyqtgraph processes also ignore KeyboardInterrupt?

Anyway, this symptom is the same on master and on this branch, can we get this merged and then troubleshoot that separately?

MerlinSmiles · 2016-05-24T12:24:35Z

Yes, sure. Not sure if I should send the 💃 if thats the case, here you have it :)

alexcjohnson · 2016-05-24T12:28:10Z

Thanks @MerlinSmiles
@damazter I think I've addressed all your earlier comments, feel free to create an issue (or discuss on slack!) for anything else that occurs to you after I merge this.

alexcjohnson added 4 commits May 9, 2016 10:02

''' -> """ in multiprocessing.py

ffd7e63

simplified server architecture

41e13a5

cleaner LogCapture syntax

4e74ad2

fix tests for the new server structure

c9bc8e5

damazter reviewed May 9, 2016
View reviewed changes

alexcjohnson added 8 commits May 12, 2016 10:56

Merge branch 'master' into queue-cleanup

7b02cac

qcodes.test_part

0cb4471

Merge branch 'master' into queue-cleanup

10a8516

ignore SIGINT in subprocesses

04a529c

separate debug and quiet measurement interrupts.

c7aef8c

abort button is quiet, qc.halt_bg() is loud

Instrument(+Server)Manager

892eceb

lint instrument.server

050e5f0

axe InstrumentConnection and fix tests

33dab07

alexcjohnson reviewed May 18, 2016
View reviewed changes

damazter reviewed May 18, 2016
View reviewed changes

alexcjohnson added 3 commits May 19, 2016 10:07

get_instrument_server(+ _manager)

51068ce

overhaul docstrings in multiprocessing.py

cac570b

reorganize qcodes.utils.multiprocessing into qcodes.process.*

f82f5b0

alexcjohnson mentioned this pull request May 19, 2016

Mock #183

Closed

alexcjohnson mentioned this pull request May 24, 2016

fix check for the case res is a numpy array #186

Closed

alexcjohnson merged commit 6375f61 into master May 24, 2016

alexcjohnson deleted the queue-cleanup branch May 24, 2016 12:28

Conversation

alexcjohnson commented May 9, 2016

Uh oh!

damazter May 9, 2016

Choose a reason for hiding this comment

Uh oh!

alexcjohnson May 10, 2016

Choose a reason for hiding this comment

Uh oh!

damazter commented May 10, 2016

Uh oh!

damazter commented May 10, 2016

Uh oh!

alexcjohnson commented May 10, 2016

Uh oh!

damazter commented May 10, 2016

Uh oh!

alexcjohnson May 18, 2016

Choose a reason for hiding this comment

Uh oh!

MerlinSmiles commented May 18, 2016

Uh oh!

damazter May 18, 2016

Choose a reason for hiding this comment

Uh oh!

alexcjohnson May 19, 2016

Choose a reason for hiding this comment

Uh oh!

alexcjohnson commented May 19, 2016

Uh oh!

peendebak commented May 19, 2016

Uh oh!

alexcjohnson commented May 19, 2016

Uh oh!

MerlinSmiles commented May 22, 2016

Uh oh!

alexcjohnson commented May 24, 2016

Uh oh!

MerlinSmiles commented May 24, 2016

Uh oh!

alexcjohnson commented May 24, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants