WIP: common libcurl#2303
Conversation
a69c88b to
1f8de50
Compare
|
Gonna try to get through it tomorrow, but as a quick test: These aren't test failures, but they're warnings that I feel we shouldn't suddenly be getting. All 6 are either "submitRemove bad state" or a couple different "curl transfer error" |
|
Do we have benchmarks / memory for the cases where it's supposed to be the most benefit (e.g. concurrent connections)? |
|
Did you consider moving the CDP socket into the network thread and treating it as just another completion? |
1f8de50 to
bd49b64
Compare
karlseguin
left a comment
There was a problem hiding this comment.
I haven't gone through the heavy changes in WebSocket, Network and HttpClient. Sorry.
| const CLOSE_PROTOCOL_ERROR = [_]u8{ 136, 2, 3, 234 }; //code: 1002 | ||
| // "private-use" close codes must be from 4000-49999 | ||
| const CLOSE_TIMEOUT = [_]u8{ 136, 2, 15, 160 }; // code: 4000 | ||
| pub const WsConnection = @This(); |
There was a problem hiding this comment.
This isn't necessarily a WebSocket connection (it starts off as an HTTP connection, answers HTTP requests and might never upgrade to a WebSocket), and it has enough CDP-specific knowledge that it belongs in cdp/Connection.zig (or cdp/Client.zig or something).
The Reader could be extracted into network/WsReader.zig, since that's generic websocket stuff.
| json_version_response: []const u8, | ||
|
|
||
| pub fn init( | ||
| self: *WsConnection, |
There was a problem hiding this comment.
From where this is used, not sure why this takes a *WsConnection versus returning a WsConnection, nothing here needs a stable pointer.
| // EOF from read(). The poll blocks for ~24 days rather than tracking | ||
| // an app-level timeout. Capped at i32-max because posix.poll narrows | ||
| // to c_int. | ||
| const wait_ms: i32 = std.math.maxInt(i32); |
There was a problem hiding this comment.
We should be using a sane timeout for handshake. As-is, a silent client consumes a slot + resources.
There was a problem hiding this comment.
A failed client will be dropped via TCP keepalive
There was a problem hiding this comment.
A failed client will, but not a silent one. Start the serve, and then connect to it socat - TCP:127.0.0.1:9222
| .events = posix.POLL.IN, | ||
| .revents = 0, | ||
| }}; | ||
| const n = try posix.poll(&pfds, wait_ms); |
There was a problem hiding this comment.
Could this be done in the network loop? I'm surprised to find any network I/O happening in the worker thread.
| try self.send(self.json_version_response); | ||
| // Chromedp (a Go driver) does an http request to /json/version | ||
| // then to / (websocket upgrade) using a different connection. | ||
| // Since we only allow 1 connection at a time, the 2nd one (the |
There was a problem hiding this comment.
This should be revisited, since we no longer only allow 1 connection at a time. Removing this would speed up every other client.
| try self.ws.init(socket, self.app.allocator, json_version_response); | ||
| errdefer self.ws.deinit(); | ||
|
|
||
| try self.browser.init(app, .{ .env = .{ .with_inspector = true } }, .{ |
There was a problem hiding this comment.
If the notes about ChromeDB are right, and it uses 2 connections, 1 to get the json version, then 1 to upgrade to websocket, than this is pretty wasteful for that specific case. But it does simplify things.
|
|
||
| const http_client = frame._session.browser.http_client; | ||
| const http_active = http_client.http_active; | ||
| const http_active = http_client.handle.http_active; |
There was a problem hiding this comment.
line 146, making a copy of http_client
Since this is an easy mistake to make and http_client is loaded in many places, what do you think about a
pub fn httpClient(self: *Browser) *HttpClient {
return &self._http_client;
}| }; | ||
|
|
||
| pub fn init(app: *App, opts: InitOpts) !Browser { | ||
| pub fn init(self: *Browser, app: *App, opts: InitOpts, cdp_client: ?HttpClient.CDPClient) !void { |
There was a problem hiding this comment.
cdp_client is optional, so it feels like it would fit nicely in InitOpts.
| clients: std.ArrayList(*Client) = .{}, | ||
| client_mutex: std.Thread.Mutex = .{}, | ||
| clients_pool: std.heap.MemoryPool(Client), | ||
| pending: std.ArrayList(*CDP) = .{}, |
There was a problem hiding this comment.
pending and conns are separated for the shutdown path? To know whether or not to send the close packet? If WsConnection knew what phase it was in (which I think is reasonable), you could then have a single list..always calling ws.close() which would do nothing when self.mode == .http. Not a huge difference either way, was just trying to understand why the shuffling between the two lists.
| }; | ||
| self.drainQueue(); | ||
|
|
||
| self.preparePollFds(multi); |
There was a problem hiding this comment.
I don't think preparePollFds is being used anymore, but it's still in the code.
| break :blk idx; | ||
| } else null; | ||
|
|
||
| libcurl.curl_multi_poll(self.multi, extra_fds[0..extra_len], timeout, null) catch |err| { |
There was a problem hiding this comment.
Can curl_multi_wakeup instead of the wakeup pipe?
| } | ||
| for (to_remove.items) |conn| self.handleRemove(conn); | ||
|
|
||
| var to_add: std.ArrayList(*http.Connection) = .empty; |
There was a problem hiding this comment.
to_remove and to_add exist to reduce the submission_mutex lock? 2 possible optimizations:
1 - Store them as fields on Network so that, instead of deinit, we can clearRetainingCapacity
2 - Use 2 lists pending_add_1 and pending_add_2 and have pending_add = &pending_add_1. Then you need a very short lock to swap them.
bb3d63e to
195fb89
Compare
|
1 - If I remove all but the first test from websocket.html, the test hangs. This isn't the case in browser/src/browser/HttpClient.zig Line 788 in d360fcc |
| // disconnect inline: we're inside a libcurl callback | ||
| // and tearing the conn down here would UAF the easy | ||
| // handle. Curl will deliver normal completion when the | ||
| // server closes the socket per RFC 6455 §5.5.1. |
There was a problem hiding this comment.
I don't think we should rely on servers following specs.
195fb89 to
6fc0370
Compare
| std.Thread.sleep(std.time.ns_per_ms); | ||
| } | ||
| if (self.handle.in_use.first != null) { | ||
| log.warn(.http, "deinit with active conns", .{ |
There was a problem hiding this comment.
This seems like a more serious issue. At this point, we'll crash N seconds later in a seemingly not-related part. I'd rather just crash here.
Also, can the loop not replace the spins and sleep with a call to handle.poll with a timeout?
Finally, is there any danger in calling processMessage here? Can we have a tailor-made cleanup?
6fc0370 to
46291f3
Compare
46291f3 to
66b9600
Compare
Reserves the CLI flag and LP.configureLoading externalStylesheets field so drivers can adopt the API before the fetch implementation lands in a follow-up that depends on lightpanda-io#2303. The bool is intentionally unread in this PR. Mirrors the existing --disable-subframes / --disable-workers plumbing; the CDP field extends LP.configureLoading alongside subFrame and worker without breaking existing callers. Refs lightpanda-io#2343
Reserves the CLI flag and LP.configureLoading externalStylesheets field so drivers can adopt the API before the fetch implementation lands in a follow-up that depends on lightpanda-io#2303. The bool is intentionally unread in this PR. Mirrors the existing --disable-subframes / --disable-workers plumbing; the CDP field extends LP.configureLoading alongside subFrame and worker without breaking existing callers. Refs lightpanda-io#2343
This PR finally moves network requests to a shared event pool.
In serve mode, Lightpanda creates a Network object on the main thread that owns the ws listener and libcurl multi handler. A separate thread with separate CDP and Browser objects is created for each CDP connection. A Network.Handle object is created for each thread, which services the current thread's requests, passes them to the Network, and manages synchronization.
HttpClient acts as a small wrapper around Network.Handle, adding browser-specific logic (caches, robots, etc.). XMLHttpRequest/Fetch/WebSocket, in turn, wrap HttpClient in a JavaScript API.
Future changes: Currently, CDP and pages are tightly coupled to a single thread. To create multiple pages in a single connection and maintain a detachable browse context, we'll have to move CDP to the main thread (even though it's not part of Network), and leave only Browser on worker threads.
It would also be helpful to clean up the Network<->HttpClient<->Connection relationship. Currently, request state is spread across Transfer, Connection, and queues in Handles, while WebSocket depends on HttpClient (even though it only needs a Handle pointer). We need to move the libcurl logic entirely to Handles, leaving the operations in http.Connection (since not all libcurl calls are thread-safe), and move some network logic (such as redirects and buffer ownership) to Network.
flowchart Server == "spawn(handleConnection)" ==> WorkerT Handle == "submit*<br/>(pending_add/remove/ops)" ==> Network Network == "on_complete<br/>(pushCompletion)" ==> Handle subgraph MainT["Main thread — Network.run()"] Network["Network<br/>(libcurl multi, accept)"] Server["Server<br/>(onAccept)"] Network -- "accept → onAccept" --> Server end subgraph WorkerT["CDP worker thread × N"] CDP --> Browser CDP --> Handle Browser --> HttpClient HttpClient --> Handle end