This is raw, unfiltered and experimental.
Sendspin is a multi-room music experience protocol. The goal of the protocol is to orchestrate all devices that make up the music listening experience. This includes outputting audio on multiple speakers simultaneously, screens and lights visualizing the audio or album art, and wall tablets providing media controls.
- Sendspin Server - orchestrates all devices, generates audio streams, manages players and clients, provides metadata
- Sendspin Client - a client that can play audio, visualize audio, display metadata, or provide music controls. Has different possible roles (player, metadata, controller, artwork, visualizer). Every client has a unique identifier
- Player - receives audio and plays it in sync. Has its own volume and mute state and preferred format settings
- Controller - controls the Sendspin group this client is part of
- Metadata - displays text metadata (title, artist, album, etc.)
- Artwork - displays artwork images. Has preferred format for images
- Visualizer - visualizes music. Has preferred format for audio features
- Sendspin Group - a group of clients. Each client belongs to exactly one group, and every group has at least one client. Every group has a unique identifier. Each group has the following states: list of member clients, volume, mute, and playback state
- Sendspin Stream - client-specific details on how the server is formatting and sending binary data. Each role's stream is managed separately. Each client receives its own independently encoded stream based on its capabilities and preferences. For players, the server sends audio chunks as far ahead as the client's buffer capacity allows. For artwork clients, the server sends album artwork and other visual images through the stream
Roles define what capabilities and responsibilities a client has. All roles use explicit versioning with the @ character: <role>@<version> (e.g., player@v1, controller@v1).
This specification defines the following roles: player, controller, metadata, artwork, visualizer. All servers must implement all versions of these roles described in this specification.
All role names and versions not starting with _ are reserved for future revisions of this specification.
Clients list roles in supported_roles in priority order (most preferred first). If a client supports multiple versions of a role, all should be listed: ["player@v2", "player@v1"].
The server activates one version per role family (e.g., one player@vN, one controller@vN)—the first match it implements from the client's list. The server reports activated roles in active_roles.
Message object keys (e.g., player?, controller?) use unversioned role names. The server determines the appropriate version from the client's active_roles.
Servers should track when clients request roles or role versions they don't implement (excluding those starting with _). This indicates the client supports a newer version of the specification and the server needs to be updated.
Custom roles outside the specification start with _ (e.g., _myapp_controller, _custom_display). Application-specific roles can also be versioned: _myapp_visualizer@v2.
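The activation rule above can be sketched as a small server-side helper. This is an illustration under assumptions, not a reference implementation: the function names `split_role` and `activate_roles` and the `implemented` set are hypothetical.

```python
def split_role(role: str) -> tuple[str, str]:
    """Split 'player@v1' into ('player', 'v1')."""
    family, _, version = role.partition("@")
    return family, version


def activate_roles(supported_roles: list[str], implemented: set[str]) -> tuple[list[str], list[str]]:
    """Pick one version per role family and collect unknown spec roles.

    supported_roles is the client's list in priority order; implemented is
    the (hypothetical) set of versioned roles this server supports.
    """
    active: dict[str, str] = {}
    unknown: list[str] = []
    for role in supported_roles:
        family, version = split_role(role)
        if role in implemented:
            # First match in the client's priority order wins for each family.
            active.setdefault(family, role)
        elif not family.startswith("_") and not version.startswith("_"):
            # A spec-reserved role the server lacks: a hint that the server is outdated.
            unknown.append(role)
    return list(active.values()), unknown
```

With `implemented = {"player@v1", "controller@v1"}`, a client listing `["player@v2", "player@v1", "controller@v1"]` would get `player@v1` and `controller@v1` activated, and `player@v2` would be recorded as unknown.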
Sendspin has two standard ways to establish connections: Server and Client initiated. Server Initiated connections are recommended as they provide standardized multi-server behavior, but require mDNS which may not be available in all environments.
Sendspin Servers must support both methods described below.
Clients announce their presence via mDNS using:
- Service type: _sendspin._tcp.local.
- Port: The port the Sendspin client is listening on (recommended: 8927)
- TXT record: path key specifying the WebSocket endpoint (recommended: /sendspin)
The server discovers available clients through mDNS and connects to each client via WebSocket using the advertised address and path.
Note: Do not manually connect to servers if you are advertising _sendspin._tcp.
In environments with multiple Sendspin servers, servers may need to reconnect to clients when starting playback to reclaim them. The server/hello message includes a connection_reason field indicating whether the server is connecting for general availability ('discovery') or for active/upcoming playback ('playback').
Clients can only be connected to one server at a time. Clients must persistently store the server_id of the server that most recently had playback_state: 'playing' (the "last played server").
When a second server connects, clients must:
- Accept incoming connections: Complete the handshake (send client/hello, receive server/hello) with the new server before making any decisions.
- Decide which server to keep:
  - If the new server's connection_reason is 'playback' → switch to new server
  - If the new server's connection_reason is 'discovery' and the existing server connected with 'playback' → keep existing server
  - If both servers have connection_reason: 'discovery':
    - Prefer the server matching the stored last played server
    - If neither matches (or no history), keep the existing server
- Disconnect: Send client/goodbye with reason 'another_server' to the server being disconnected, then close the connection.
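The decision rules above could be expressed as a single function on the client. This is a minimal sketch; the function name `choose_server` and its parameters are hypothetical, and it returns the server_id of the server to keep.

```python
from typing import Optional


def choose_server(
    existing_id: str,
    existing_reason: str,
    new_id: str,
    new_reason: str,
    last_played_id: Optional[str],
) -> str:
    """Return the server_id to keep after a second server completes its handshake."""
    if new_reason == "playback":
        # A server that needs the client for playback always wins.
        return new_id
    if existing_reason == "playback":
        # New server is only discovering; the existing playback connection stays.
        return existing_id
    # Both connected for discovery: prefer the stored last played server.
    if new_id == last_played_id:
        return new_id
    # Existing server matches, or neither matches (or no history): keep existing.
    return existing_id
```

The caller would then send client/goodbye with reason 'another_server' to whichever server was not returned.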
If clients prefer to initiate the connection instead of waiting for the server to connect, the server must be discoverable via mDNS using:
- Service type: _sendspin-server._tcp.local.
- Port: The port the Sendspin server is listening on (recommended: 8927)
- TXT record: path key specifying the WebSocket endpoint (recommended: /sendspin)
Clients discover the server through mDNS and initiate a WebSocket connection using the advertised address and path.
Note: Do not advertise _sendspin._tcp if the client plans to initiate the connection.
Unlike server-initiated connections, servers cannot reclaim clients by reconnecting. How clients handle multiple discovered servers, server selection, and switching is implementation-defined.
Note: After this point, Sendspin works independently of how the connection was established. The Sendspin client is always the consumer of data like audio or metadata, regardless of who initiated the connection.
While custom connection methods are possible for specialized use cases (like remotely accessible web-browsers, mobile apps), most clients should use one of the two standardized methods above if possible.
Once the connection is established, the client and server begin exchanging messages.
The first message must always be a client/hello message from the client to the server.
Once the server receives this message, it responds with a server/hello message. Before this handshake is complete, no other messages should be sent.
WebSocket text messages are used to send JSON payloads.
Note: In field definitions, ? indicates an optional field (e.g., field?: type means the field may be omitted).
All messages have a type field identifying the message and a payload object containing message-specific data. The payload structure varies by message type and is detailed in each message section below.
Message format example:
{
"type": "stream/start",
"payload": {
"player": {
"codec": "opus",
"sample_rate": 48000,
"channels": 2,
"bit_depth": 16
},
"artwork": {
"channels": [
{
"source": "album",
"format": "jpeg",
"width": 800,
"height": 800
}
]
}
}
}
WebSocket binary messages are used to send audio chunks, media art, and visualization data. The first byte is a uint8 representing the message type.
Binary message IDs typically use bits 7-2 for role type and bits 1-0 for message slot, allocating 4 IDs per role. Roles with expanded allocations use bits 2-0 for message slot (8 IDs).
Role assignments:
- 000000xx (0-3): Reserved for future use
- 000001xx (4-7): Player role
- 000010xx (8-11): Artwork role
- 000011xx (12-15): Reserved for a future role
- 00010xxx (16-23): Visualizer role
- Roles 6-47 (IDs 24-191): Reserved for future roles
- Roles 48-63 (IDs 192-255): Available for use by application-specific roles
Message slots:
- Slot 0: xxxxxx00
- Slot 1: xxxxxx01
- Slot 2: xxxxxx10
- Slot 3: xxxxxx11
Roles with expanded allocations have slots 0-7.
Note: Role versions share the same binary message IDs (e.g., player@v1 and player@v2 both use IDs 4-7).
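As an illustration of the ID allocation above, the first byte of a binary message could be mapped to a role and slot as follows. The table and the function name `decode_message_type` are hypothetical, and reporting reserved IDs with no slot is an assumption of this sketch.

```python
from typing import Optional

# (first ID, last ID, role) for the roles assigned by this specification.
ROLE_RANGES = [
    (4, 7, "player"),
    (8, 11, "artwork"),
    (16, 23, "visualizer"),  # expanded allocation: 8 IDs, slots 0-7
]


def decode_message_type(type_id: int) -> tuple[str, Optional[int]]:
    """Map the first byte of a binary message to (role, slot)."""
    if not 0 <= type_id <= 255:
        raise ValueError("message type must fit in a uint8")
    for first, last, role in ROLE_RANGES:
        if first <= type_id <= last:
            return role, type_id - first
    if type_id >= 192:
        # Roles 48-63: application-specific, 4 IDs each (bits 1-0 are the slot).
        return "application", type_id & 0b11
    # Everything else is reserved by the specification.
    return "reserved", None
```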
Clients continuously send client/time messages to maintain an accurate offset from the server's clock. The frequency of these messages is determined by the client based on network conditions and clock stability.
Binary audio messages contain timestamps in the server's time domain indicating when the audio should be played. Clients use their computed offset to translate server timestamps to their local clock for synchronized playback.
Note: For microsecond-level synchronization precision, consider using a two-dimensional Kalman filter to track both clock offset and drift. See the time-filter repository for a C++ implementation and aiosendspin for a Python implementation.
- Each client is responsible for maintaining synchronization with the server's timestamps
- Clients maintain accurate sync by adding or removing samples using interpolation to compensate for clock drift
- When a client cannot maintain sync (e.g., buffer underrun), it should send state: 'error' via client/state, mute its audio output, and continue buffering until it can resume synchronized playback, at which point it should send state: 'synchronized'
- The server is unaware of individual client synchronization accuracy - it simply broadcasts timestamped audio
- The server sends audio to late-joining clients with future timestamps only, allowing them to buffer and start playback in sync with existing clients
- Audio chunks may arrive with timestamps in the past due to network delays or buffering; clients should drop these late chunks to maintain sync
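The timestamp translation and late-chunk rule above might look like this on a client. It is a sketch under one assumption: `offset_us` is defined as server clock minus client clock, in microseconds. Both function names are hypothetical.

```python
def local_play_time_us(server_timestamp_us: int, offset_us: int) -> int:
    """Translate a server timestamp to the client's local clock.

    Assumes offset_us = server_clock - client_clock.
    """
    return server_timestamp_us - offset_us


def is_late(server_timestamp_us: int, offset_us: int, now_local_us: int) -> bool:
    """True if the chunk's play time has already passed and it should be dropped."""
    return local_play_time_us(server_timestamp_us, offset_us) < now_local_us
```

A real player would probably also subtract its known output latency from the local play time before comparing, so that chunks which cannot reach the hardware in time are dropped as well.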
sequenceDiagram
participant Client
participant Server
Note over Client,Server: WebSocket connection established
Note over Client,Server: Text messages = JSON payloads, Binary messages = Audio/Art/Visualization
Client->>Server: client/hello (roles and capabilities)
Server->>Client: server/hello (server info, connection_reason)
Client->>Server: client/state (state: synchronized)
alt Player role
Client->>Server: client/state (player: volume, muted)
end
loop Continuous clock sync
Client->>Server: client/time (client clock)
Server->>Client: server/time (timing + offset info)
end
alt Stream starts
Server->>Client: stream/start (codec, format details)
end
Server->>Client: group/update (playback_state, group_id, group_name)
Server->>Client: server/state (metadata, controller)
loop During playback
alt Player role
Server->>Client: binary Type 4 (audio chunks with timestamps)
end
alt Artwork role
Server->>Client: binary Types 8-11 (artwork channels 0-3)
end
alt Visualizer role
Server->>Client: binary Type 16 (visualization data)
end
end
alt Player requests format change
Client->>Server: stream/request-format (codec, sample_rate, etc)
Server->>Client: stream/start (player: new format)
end
alt Seek operation
Server->>Client: stream/clear (roles: [player, visualizer])
end
alt Controller role
Client->>Server: client/command (controller: play/pause/volume/switch/etc)
end
alt State changes
Client->>Server: client/state (state and/or player changes)
end
alt Server commands player
Server->>Client: server/command (player: volume, mute)
end
Server->>Client: stream/end (ends all role streams)
alt Graceful disconnect
Client->>Server: client/goodbye (reason)
Note over Client,Server: Server initiates disconnect
end
This section describes the fundamental messages that establish communication between clients and the server. These messages handle initial handshakes, ongoing clock synchronization, stream lifecycle management, and role-based state updates and commands.
Every Sendspin client and server must implement all messages in this section regardless of their specific roles. Role-specific object details are documented in their respective role sections and need to be implemented only if the client supports that role.
First message sent by the client after establishing the WebSocket connection. Contains information about the client's capabilities and roles.
This message will be followed by a server/hello message from the server.
Clients that can output audio should have the role player.
- client_id: string - uniquely identifies the client for groups and de-duplication. Should remain persistent across reconnections so servers can associate clients with previous sessions (e.g., remembering group membership, settings, playback queue)
- name: string - friendly name of the client
- device_info?: object - optional information about the device
  - product_name?: string - device model/product name
  - manufacturer?: string - device manufacturer name
  - software_version?: string - software version of the client (not the Sendspin version)
- version: integer (must be 1) - version of the core message format that the Sendspin client implements (independent of role versions)
- supported_roles: string[] - versioned roles supported by the client (e.g., player@v1, controller@v1). Defined versioned roles are:
  - player@v1 - outputs audio
  - controller@v1 - controls the current Sendspin group
  - metadata@v1 - displays text metadata describing the currently playing audio
  - artwork@v1 - displays artwork images
  - visualizer@v1 - visualizes audio
- player@v1_support?: object - only if player@v1 is listed (see player@v1 support object details)
- artwork@v1_support?: object - only if artwork@v1 is listed (see artwork@v1 support object details)
- visualizer@v1_support?: object - only if visualizer@v1 is listed (see visualizer@v1 support object details)
Note: Each role version may have its own support object (e.g., player@v1_support, player@v2_support). Application-specific roles or role versions follow the same pattern (e.g., _myapp_display@v1_support, player@_experimental_support).
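For illustration, a client/hello message for a client with the player and controller roles might be built like this. All concrete values (client ID, name, formats, buffer size) and the function name `build_client_hello` are made up for the example; the field names come from the definitions above and from the player@v1 support object.

```python
import json


def build_client_hello(client_id: str, name: str) -> str:
    """Serialize a client/hello text message for a hypothetical stereo speaker."""
    payload = {
        "client_id": client_id,  # should stay stable across reconnections
        "name": name,
        "version": 1,  # core message format version
        "supported_roles": ["player@v1", "controller@v1"],  # priority order
        "player@v1_support": {
            "supported_formats": [
                {"codec": "opus", "channels": 2, "sample_rate": 48000, "bit_depth": 16},
                {"codec": "pcm", "channels": 2, "sample_rate": 44100, "bit_depth": 16},
            ],
            "buffer_capacity": 1_000_000,  # bytes, example value only
            "supported_commands": ["volume", "mute"],
        },
    }
    return json.dumps({"type": "client/hello", "payload": payload})
```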
Sends current internal clock timestamp (in microseconds) to the server.
Once received, the server responds with a server/time message containing timing information to establish clock offsets.
client_transmitted: integer - client's internal clock timestamp in microseconds
Response to the client/hello message with information about the server.
Only after receiving this message should the client send any other messages (including client/time and the initial client/state message if the client has roles that require state updates).
- server_id: string - identifier of the server
- name: string - friendly name of the server
- version: integer (must be 1) - version of the core message format that the server implements (independent of role versions)
- active_roles: string[] - versioned roles that are active for this client (e.g., player@v1, controller@v1)
- connection_reason: 'discovery' | 'playback' - only used for server-initiated connections
  - discovery - server is connecting for general availability (e.g., initial discovery, reconnection after connection loss)
  - playback - server needs client for active or upcoming playback
Note: Servers will always activate the client's preferred version of each role. Checking active_roles is only necessary to detect outdated servers or confirm activation of application-specific roles.
Response to the client/time message with timestamps to establish clock offsets.
For synchronization, all timing is relative to the server's monotonic clock. These timestamps have microsecond precision and are not necessarily based on epoch time.
- client_transmitted: integer - client's internal clock timestamp received in the client/time message
- server_received: integer - timestamp that the server received the client/time message in microseconds
- server_transmitted: integer - timestamp that the server transmitted this message in microseconds
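One common way to turn these three timestamps into an offset is the NTP-style estimate, sketched below. The specification text here does not prescribe a formula, so treat this as an assumption. `client_received` is the client's local clock reading when server/time arrives; it is not a message field, and the function name is hypothetical.

```python
def estimate_offset_us(
    client_transmitted: int,
    server_received: int,
    server_transmitted: int,
    client_received: int,
) -> tuple[int, int]:
    """Return (offset, round_trip_delay) in microseconds.

    offset is server clock minus client clock, assuming the network delay
    is roughly symmetric in both directions.
    """
    offset = ((server_received - client_transmitted)
              + (server_transmitted - client_received)) // 2
    # Time spent on the network, excluding server processing time.
    delay = (client_received - client_transmitted) - (server_transmitted - server_received)
    return offset, delay
```

Single samples are noisy, so a client would typically feed many of these estimates into a filter (such as the Kalman filter mentioned earlier) rather than use one directly.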
Client sends state updates to the server. Contains client-level state and role-specific state objects.
Must be sent immediately after receiving server/hello, and whenever any state changes thereafter.
For the initial message, include all state fields. For subsequent updates, only include fields that have changed. The server will merge these updates into existing state.
- state: 'synchronized' | 'error' | 'external_source' - operational state of the client
  - 'synchronized' - client is operational and synchronized with server timestamps
  - 'error' - client has a problem preventing normal operation (unable to keep up, clock sync issues, etc.)
  - 'external_source' - client is in use by an external system and is not currently participating in Sendspin playback with this server. See External Source Handling
- player?: object - only if client has player role (see player state object details)
Application-specific roles may also include objects in this message (keys starting with _).
When a client sets state: 'external_source', it indicates the client's output is in use by an external system (e.g., a different audio source, HDMI input, or local media playback) and is not currently participating in Sendspin playback with this server.
If the client is in a multi-client group:
- Remember the client's current group as its "previous group" (see switch command cycle)
- Move the client to a new solo group (stopped)
- Send group/update with the new group information
- Send stream/end for all active streams
If the client is already in a solo group:
- Stop playback and send stream/end for all active streams
Client sends commands to the server. Contains command objects based on the client's supported roles.
- controller?: object - only if client has controller role (see controller command object details)
Application-specific roles may also include objects in this message (keys starting with _).
Server sends state updates to the client. Contains role-specific state objects.
Only include fields that have changed. The client will merge these updates into existing state. Fields set to null should be cleared from the client's state.
- metadata?: object - only sent to clients with metadata role (see metadata state object details)
- controller?: object - only sent to clients with controller role (see controller state object details)
Application-specific roles may also include objects in this message (keys starting with _).
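The merge behaviour described above (changed fields only, null clears a field) could be sketched as follows. Merging nested objects recursively is an assumption of this example, and `merge_state` is a hypothetical name.

```python
def merge_state(current: dict, update: dict) -> dict:
    """Return a new state with a delta update applied.

    Keys set to None (JSON null) are removed; nested objects are merged
    recursively; every other value replaces the stored one.
    """
    merged = dict(current)
    for key, value in update.items():
        if value is None:
            merged.pop(key, None)  # null clears the field
        elif isinstance(value, dict):
            existing = merged.get(key)
            base = existing if isinstance(existing, dict) else {}
            merged[key] = merge_state(base, value)
        else:
            merged[key] = value
    return merged
```

For example, applying `{"metadata": {"title": "C", "artist": None}}` to a state holding a title and an artist leaves only the new title.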
Server sends commands to the client. Contains role-specific command objects.
- player?: object - only sent to clients with player role (see player command object details)
Application-specific roles may also include objects in this message (keys starting with _).
Starts a stream for one or more roles. If sent for a role that already has an active stream, updates the stream configuration without clearing buffers.
- player?: object - only sent to clients with the player role (see player object details)
- artwork?: object - only sent to clients with the artwork role (see artwork object details)
- visualizer?: object - only sent to clients with the visualizer role (see visualizer object details)
Application-specific roles may also include objects in this message (keys starting with _).
Instructs clients to clear buffers without ending the stream. Used for seek operations.
roles?: string[] - which roles to clear: 'player', 'visualizer', or both. If omitted, clears both roles
Application-specific roles may also be included in this array (names starting with _).
Request different stream format (upgrade or downgrade). Available for clients with the player or artwork role.
- player?: object - only for clients with the player role (see player object details)
- artwork?: object - only for clients with the artwork role (see artwork object details)
Application-specific roles may also include objects in this message (keys starting with _).
Response: stream/start for the requested role(s) with the new format.
Note: Clients should use this message to adapt to changing network conditions, CPU constraints, or display requirements. The server maintains separate encoding for each client, allowing heterogeneous device capabilities within the same group.
Ends the stream for one or more roles. When received, clients should stop output and clear buffers for the specified roles.
roles?: string[] - roles to end streams for ('player', 'artwork', 'visualizer'). If omitted, ends all active streams
Application-specific roles may also be included in this array (names starting with _).
State update of the group this client is part of.
Contains delta updates with only the changed fields. The client should merge these updates into existing state. Fields set to null should be cleared from the client's state.
- playback_state?: 'playing' | 'stopped' - playback state of the group
- group_id?: string - group identifier
- group_name?: string - friendly name of the group
Sent by the client before gracefully closing the connection. This allows the client to inform the server why it is disconnecting.
Upon receiving this message, the server should initiate the disconnect.
- reason: 'another_server' | 'shutdown' | 'restart' | 'user_request'
  - another_server - client is switching to a different Sendspin server. Server should not auto-reconnect but should show the client as available for future playback
  - shutdown - client is shutting down. Server should not auto-reconnect
  - restart - client is restarting and will reconnect. Server should auto-reconnect
  - user_request - user explicitly requested to disconnect from this server. Server should not auto-reconnect
Note: Clients may close the connection without sending this message (e.g., crash, network loss), or immediately after sending client/goodbye without waiting for the server to disconnect. When a client disconnects without sending client/goodbye, servers should assume the disconnect reason is restart and attempt to auto-reconnect.
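The reconnect behaviour per reason could be captured in a small lookup on the server. This is a sketch; the names are hypothetical, and refusing to reconnect on an unrecognized reason is an assumption, since the text above does not cover that case.

```python
from typing import Optional

# Whether the server should auto-reconnect, per client/goodbye reason.
AUTO_RECONNECT = {
    "another_server": False,
    "shutdown": False,
    "restart": True,
    "user_request": False,
}


def should_auto_reconnect(reason: Optional[str]) -> bool:
    """Decide whether to reconnect after a client disconnects.

    reason is None when the client vanished without sending client/goodbye,
    which is treated like 'restart'.
    """
    if reason is None:
        return True
    # Assumption: unknown reasons are handled conservatively (no reconnect).
    return AUTO_RECONNECT.get(reason, False)
```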
This section describes messages specific to clients with the player role, which handle audio output and synchronized playback. Player clients receive timestamped audio data, manage their own volume and mute state, and can request different audio formats based on their capabilities and current conditions.
Note: Volume values (0-100) represent perceived loudness, not linear amplitude (e.g., volume 50 should be perceived as half as loud as volume 100). Players must convert these values to appropriate amplitude for their audio hardware.
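One possible conversion is sketched below. It relies on the common psychoacoustic rule of thumb that a 10 dB change is perceived as roughly a doubling or halving of loudness. That rule, the resulting curve, and the function name `volume_to_amplitude` are assumptions of this example; the specification leaves the exact mapping to the player.

```python
import math


def volume_to_amplitude(volume: int) -> float:
    """Map a 0-100 perceived-loudness volume to a linear amplitude gain (0.0-1.0).

    Assumes halving perceived loudness corresponds to about -10 dB.
    """
    if not 0 <= volume <= 100:
        raise ValueError("volume must be in the range 0-100")
    if volume == 0:
        return 0.0  # fully silent; avoids log2(0)
    gain_db = 10.0 * math.log2(volume / 100.0)  # volume 50 gives -10 dB
    return 10.0 ** (gain_db / 20.0)  # dB to linear amplitude
```

Under this assumption, volume 50 maps to an amplitude gain of about 0.316 rather than 0.5.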
The player@v1_support object in client/hello has this structure:
- player@v1_support: object
  - supported_formats: object[] - list of supported audio formats in priority order (first is preferred)
    - codec: 'opus' | 'flac' | 'pcm' - codec identifier
    - channels: integer - supported number of channels (e.g., 1 = mono, 2 = stereo)
    - sample_rate: integer - sample rate in Hz (e.g., 44100)
    - bit_depth: integer - bit depth for this format (e.g., 16, 24)
  - buffer_capacity: integer - max size in bytes of compressed audio messages in the buffer that are yet to be played
  - supported_commands: string[] - subset of: 'volume', 'mute'
Note: Servers must support all audio codecs: 'opus', 'flac', and 'pcm'.
The player object in client/state has this structure:
Informs the server of player-specific state changes. Only for clients with the player role.
State updates must be sent whenever any state changes, including when the volume was changed through a server/command or via device controls.
- player: object
  - volume?: integer - range 0-100, must be included if 'volume' is in supported_commands from player@v1_support
  - muted?: boolean - mute state, must be included if 'mute' is in supported_commands from player@v1_support
The player object in stream/request-format has this structure:
- player: object
  - codec?: 'opus' | 'flac' | 'pcm' - requested codec identifier
  - channels?: integer - requested number of channels (e.g., 1 = mono, 2 = stereo)
  - sample_rate?: integer - requested sample rate in Hz (e.g., 44100, 48000)
  - bit_depth?: integer - requested bit depth (e.g., 16, 24)
Response: stream/start with the new format.
Note: Clients should use this message to adapt to changing network conditions or CPU constraints. The server maintains separate encoding for each client, allowing heterogeneous device capabilities within the same group.
The player object in server/command has this structure:
Request the player to perform an action, e.g., change volume or mute state.
- player: object
  - command: 'volume' | 'mute' - should be one of the values listed in supported_commands in the player@v1_support object in the client/hello message. Commands not in supported_commands are ignored by the client
  - volume?: integer - volume range 0-100, only set if command is volume
  - mute?: boolean - true to mute, false to unmute, only set if command is mute
The player object in stream/start has this structure:
- player: object
  - codec: string - codec to be used
  - sample_rate: integer - sample rate to be used
  - channels: integer - channels to be used
  - bit_depth: integer - bit depth to be used
  - codec_header?: string - Base64 encoded codec header (if necessary; e.g., FLAC)
When stream/clear includes the player role, clients should clear all buffered audio chunks and continue with chunks received after this message.
Binary messages should be rejected if there is no active stream.
- Byte 0: message type 4 (uint8)
- Bytes 1-8: timestamp (big-endian int64) - server clock time in microseconds when the first sample should be output
- Rest of bytes: encoded audio frame
The timestamp indicates when the first audio sample in this chunk should be output. Clients must translate this server timestamp to their local clock using the offset computed from clock synchronization. Clients should compensate for any known processing delays (e.g., DAC latency, audio buffer delays, amplifier delays) by accounting for these delays when submitting audio to the hardware.
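The byte layout above can be unpacked with Python's struct module. This is a minimal sketch; `parse_audio_chunk` is a hypothetical name, and checking for an active stream is left to the caller.

```python
import struct

# Byte 0: uint8 message type, bytes 1-8: big-endian int64 timestamp.
HEADER = struct.Struct(">Bq")
AUDIO_CHUNK_TYPE = 4


def parse_audio_chunk(message: bytes) -> tuple[int, bytes]:
    """Return (server_timestamp_us, encoded_audio_frame) from a binary message."""
    if len(message) < HEADER.size:
        raise ValueError("binary message shorter than the 9-byte header")
    message_type, timestamp_us = HEADER.unpack_from(message)
    if message_type != AUDIO_CHUNK_TYPE:
        raise ValueError(f"not an audio chunk: type {message_type}")
    return timestamp_us, message[HEADER.size:]
```

The returned timestamp is still in the server's time domain and has to be translated to the local clock before scheduling playback.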
This section describes messages specific to clients with the controller role, which enables the client to control the Sendspin group this client is part of, and switch between groups.
Every client which lists the controller role in the supported_roles of the client/hello message needs to implement all messages in this section.
The controller object in client/command has this structure:
Control the group that's playing and switch groups. Only valid from clients with the controller role.
- controller: object
  - command: 'play' | 'pause' | 'stop' | 'next' | 'previous' | 'volume' | 'mute' | 'repeat_off' | 'repeat_one' | 'repeat_all' | 'shuffle' | 'unshuffle' | 'switch' - should be one of the values listed in supported_commands from the server/state controller object. Commands not in supported_commands are ignored by the server
  - volume?: integer - volume range 0-100, only set if command is volume
  - mute?: boolean - true to mute, false to unmute, only set if command is mute
- 'play' - resume playback from current position. If nothing is currently playing, the server must try to resume the group's last playing media. This history should persist across server and client reboots
- 'pause' - pause playback at current position
- 'stop' - stop playback and reset position to beginning
- 'next' - skip to next track, chapter, etc.
- 'previous' - skip to previous track, chapter, restart current, etc.
- 'volume' - set group volume (requires volume parameter)
- 'mute' - set group mute state (requires mute parameter)
- 'repeat_off' - disable repeat mode
- 'repeat_one' - repeat the current track continuously
- 'repeat_all' - repeat all tracks continuously
- 'shuffle' - randomize playback order
- 'unshuffle' - restore original playback order
- 'switch' - move this client to the next group in a predefined cycle as described below
Setting group volume: When setting group volume via the 'volume' command, the server applies the following algorithm to preserve relative volume levels while achieving the requested volume as closely as player boundaries allow:
1. Calculate the delta: delta = requested_volume - current_group_volume (where current group volume is the average of all player volumes)
2. Apply the delta to each player's volume
3. Clamp any player volumes that exceed boundaries (0-100%)
4. If any players were clamped:
   - Calculate the lost delta: sum of (proposed_volume - clamped_volume) for all clamped players
   - Divide the lost delta equally among non-clamped players
5. Repeat steps 1-4 until either:
   - All delta has been successfully applied, or
   - All players are clamped at their volume boundaries
This ensures that when setting group volume to 100%, all players will reach 100% if possible, and the final group volume matches the requested volume as closely as player boundaries allow.
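A possible implementation of the redistribution algorithm is sketched below. It tracks the undistributed delta as a running total rather than recomputing the average each round, which should be equivalent. The function name `set_group_volume` and the final rounding to integers are assumptions of this example.

```python
def set_group_volume(player_volumes: list[int], requested_volume: int) -> list[int]:
    """Return new per-player volumes for a requested group volume (0-100)."""
    volumes = [float(v) for v in player_volumes]
    if not volumes:
        return []
    # Total amount still to distribute: delta times the number of players.
    remaining = (requested_volume - sum(volumes) / len(volumes)) * len(volumes)
    unclamped = set(range(len(volumes)))
    while unclamped and remaining != 0.0:
        share = remaining / len(unclamped)  # equal share per non-clamped player
        remaining = 0.0
        for i in list(unclamped):
            proposed = volumes[i] + share
            clamped = min(100.0, max(0.0, proposed))
            volumes[i] = clamped
            if clamped != proposed:
                # Carry the lost delta into the next round.
                remaining += proposed - clamped
                unclamped.discard(i)
    # Assumption: volumes are reported as integers, so round at the end.
    return [round(v) for v in volumes]
```

For players at 80 and 40 and a requested group volume of 90, the first round clamps the louder player at 100 and the second round gives the lost 10 to the other player, ending at 100 and 80 with an average of 90.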
Setting group mute: When setting group mute via the 'mute' command, the server applies the mute state to all players in the group.
Previous group priority: If the client is still in the solo group from its 'external_source' transition, the switch command prioritizes rejoining the previous group.
For clients with the player role, the cycle includes:
- Multi-client groups that are currently playing
- Single-client groups (other players playing alone)
- A solo group containing only this client
For clients without the player role, the cycle includes:
- Multi-client groups that are currently playing
- Single-client groups (other players playing alone)
The controller object in server/state has this structure:
- controller: object
  - supported_commands: string[] - subset of: 'play' | 'pause' | 'stop' | 'next' | 'previous' | 'volume' | 'mute' | 'repeat_off' | 'repeat_one' | 'repeat_all' | 'shuffle' | 'unshuffle' | 'switch'
  - volume: integer - volume of the whole group, range 0-100
  - muted: boolean - mute state of the whole group
Reading group volume: Group volume is calculated as the average of all player volumes in the group.
Reading group mute: Group mute is true only when all players in the group are muted. If some players are muted and others are not, group mute is false.
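Both read rules are simple reductions over the players in the group, as in this sketch. The function names are hypothetical, and rounding the average to an integer as well as rejecting an empty player list are assumptions.

```python
def group_volume(player_volumes: list[int]) -> int:
    """Group volume is the average of all player volumes (rounded here)."""
    if not player_volumes:
        raise ValueError("group has no players")
    return round(sum(player_volumes) / len(player_volumes))


def group_muted(player_muted: list[bool]) -> bool:
    """Group mute is true only when every player in the group is muted."""
    if not player_muted:
        raise ValueError("group has no players")
    return all(player_muted)
```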
This section describes messages specific to clients with the metadata role, which handle display of track information and playback progress. Metadata clients receive state updates with track details.
The metadata object in server/state has this structure:
- metadata: object
  - timestamp: integer - server clock time in microseconds for when this metadata is valid
  - title?: string | null - track title
  - artist?: string | null - primary artist(s)
  - album_artist?: string | null - album artist(s)
  - album?: string | null - name of the album or release that this track belongs to
  - artwork_url?: string | null - URL to artwork image. Useful for clients that want to forward metadata to external systems or for powerful clients that can fetch and process images themselves
  - year?: integer | null - release year in YYYY format
  - track?: integer | null - track number on the album (1-indexed), null if unknown or not applicable
  - progress?: object | null - playback progress information. The server must send this object whenever playback state changes (play, pause, resume, seek, playback speed change)
    - track_progress: integer - current playback position in milliseconds since start of track
    - track_duration: integer - total track length in milliseconds, 0 for unlimited/unknown duration (e.g., live radio streams)
    - playback_speed: integer - playback speed multiplier * 1000 (e.g., 1000 = normal speed, 1500 = 1.5x speed, 500 = 0.5x speed, 0 = paused)
  - repeat?: 'off' | 'one' | 'all' | null - repeat mode: 'off' = no repeat, 'one' = repeat current track, 'all' = repeat all tracks (in the queue, playlist, etc.)
  - shuffle?: boolean | null - shuffle mode enabled/disabled
Clients can calculate the current track position at any time using the timestamp and progress values from the last metadata message that included the progress object:
calculated_progress = metadata.progress.track_progress + (current_time - metadata.timestamp) * metadata.progress.playback_speed / 1000000
if metadata.progress.track_duration != 0:
    current_track_progress_ms = max(min(calculated_progress, metadata.progress.track_duration), 0)
else:
    current_track_progress_ms = max(calculated_progress, 0)
This section describes messages specific to clients with the artwork role, which handle display of artwork images. Artwork clients receive images in their preferred format and resolution.
Channels: Artwork clients can support 1-4 independent channels, allowing them to display multiple related images. For example, a device could display album artwork on one channel while simultaneously showing artist photos or background images on other channels. Each channel operates independently with its own format, resolution, and source type (album or artist artwork).
The artwork@v1_support object in client/hello has this structure:
- artwork@v1_support: object
  - channels: object[] - list of supported artwork channels (length 1-4), array index is the channel number
    - source: 'album' | 'artist' | 'none' - artwork source type
    - format: 'jpeg' | 'png' | 'bmp' - image format identifier
    - media_width: integer - max width in pixels
    - media_height: integer - max height in pixels
Note: The server will scale images to fit within the specified dimensions while preserving aspect ratio. Clients can support 1-4 independent artwork channels depending on their display capabilities. The channel number is determined by array position: channels[0] is channel 0 (binary message type 8), channels[1] is channel 1 (binary message type 9), etc.
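As a rough illustration of the scaling rule and the channel numbering, the sketch below computes output dimensions that fit inside a channel's media_width and media_height while preserving aspect ratio, and maps a channel index to the binary message types 8-11 listed in the binary message format. The function names are hypothetical, and two details are assumptions because this section does not state them: the sketch rounds down to whole pixels and never upscales images that already fit.

```python
ARTWORK_TYPE_BASE = 8  # binary message type for artwork channel 0


def fit_within(src_w: int, src_h: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Return (width, height) fitting inside max_w x max_h with the source aspect ratio."""
    if min(src_w, src_h, max_w, max_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Assumption: images that already fit are left at their original size.
    if src_w <= max_w and src_h <= max_h:
        return src_w, src_h
    # Integer cross-multiplication avoids floating-point rounding surprises.
    if src_w * max_h >= src_h * max_w:
        # Width is the limiting dimension.
        return max_w, max(1, src_h * max_w // src_w)
    # Height is the limiting dimension.
    return max(1, src_w * max_h // src_h), max_h


def artwork_message_type(channel: int) -> int:
    """Map an artwork channel index (0-3) to its binary message type (8-11)."""
    if not 0 <= channel <= 3:
        raise ValueError("artwork channel must be in the range 0-3")
    return ARTWORK_TYPE_BASE + channel
```

For example, a 1000x500 source requested for a 300x300 channel would come out as 300x150 under these assumptions.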
None source: If a channel has source set to none, the server will not send any artwork data for that channel. This allows clients to disable and enable specific channels on the fly through stream/request-format without needing to re-establish the WebSocket connection (useful for dynamic display layouts).
Note: Servers must support all image formats: 'jpeg', 'png', and 'bmp'.
The artwork object in stream/request-format has this structure:
Request the server to change the artwork format for a specific channel. The client can send multiple stream/request-format messages to change formats on different channels.
After receiving this message, the server responds with stream/start for the artwork role with the new format, followed by immediate artwork updates through binary messages.
- artwork: object
  - channel: integer - channel number (0-3) corresponding to the channel index declared in the artwork@v1_support object of client/hello
  - source?: 'album' | 'artist' | 'none' - artwork source type
  - format?: 'jpeg' | 'png' | 'bmp' - requested image format identifier
  - media_width?: integer - requested max width in pixels
  - media_height?: integer - requested max height in pixels
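A minimal sketch of how a client might assemble the artwork object for stream/request-format, including only the optional fields it wants to change. The helper name build_artwork_request and its parameter names are hypothetical. How this object is wrapped in the full stream/request-format message is not covered in this section, so the sketch stops at the artwork object itself.

```python
from typing import Optional

VALID_SOURCES = {"album", "artist", "none"}
VALID_FORMATS = {"jpeg", "png", "bmp"}


def build_artwork_request(
    channel: int,
    source: Optional[str] = None,
    image_format: Optional[str] = None,
    media_width: Optional[int] = None,
    media_height: Optional[int] = None,
) -> dict:
    """Build the `artwork` object, omitting optional fields that are not set."""
    if not 0 <= channel <= 3:
        raise ValueError("channel must be in the range 0-3")
    if source is not None and source not in VALID_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if image_format is not None and image_format not in VALID_FORMATS:
        raise ValueError(f"unknown format: {image_format!r}")

    artwork: dict = {"channel": channel}
    if source is not None:
        artwork["source"] = source
    if image_format is not None:
        artwork["format"] = image_format
    if media_width is not None:
        artwork["media_width"] = media_width
    if media_height is not None:
        artwork["media_height"] = media_height
    return artwork
```

Calling build_artwork_request(1, source="none") would yield an object that disables channel 1, and a later call with source="artist" would re-enable it without reconnecting.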
The artwork object in stream/start has this structure:
- artwork: object
  - channels: object[] - configuration for each active artwork channel, array index is the channel number
    - source: 'album' | 'artist' | 'none' - artwork source type
    - format: 'jpeg' | 'png' | 'bmp' - format of the encoded image
    - width: integer - width in pixels of the encoded image
    - height: integer - height in pixels of the encoded image
Binary messages should be rejected if there is no active stream.
- Byte 0: message type 8-11 (uint8) - corresponds to artwork channel 0-3 respectively
- Bytes 1-8: timestamp (big-endian int64) - server clock time in microseconds when the image should be displayed by the device
- Rest of bytes: encoded image
The message type determines which artwork channel this image is for:
- Type 8: Channel 0 (Artwork role, slot 0)
- Type 9: Channel 1 (Artwork role, slot 1)
- Type 10: Channel 2 (Artwork role, slot 2)
- Type 11: Channel 3 (Artwork role, slot 3)
The timestamp indicates when this artwork should be displayed. Clients must translate this server timestamp to their local clock using the offset computed from clock synchronization.
Clearing artwork: To clear the currently displayed artwork on a specific channel, the server sends an empty binary message (only the message type byte and timestamp, with no image data) for that channel.
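The artwork binary layout described above (one type byte, an 8-byte big-endian timestamp, then the encoded image) could be decoded roughly as follows. The names ArtworkMessage and parse_artwork_message are hypothetical, and representing a clear message as image=None is a choice made for this sketch only.

```python
import struct
from typing import NamedTuple, Optional

# uint8 message type followed by a big-endian int64 timestamp (9 bytes total).
_HEADER = struct.Struct(">Bq")


class ArtworkMessage(NamedTuple):
    channel: int             # 0-3, derived from message types 8-11
    timestamp_us: int        # server clock time in microseconds
    image: Optional[bytes]   # encoded image, or None for a clear message


def parse_artwork_message(data: bytes) -> ArtworkMessage:
    """Decode one artwork binary message."""
    if len(data) < _HEADER.size:
        raise ValueError("message shorter than the 9-byte header")
    msg_type, timestamp_us = _HEADER.unpack_from(data)
    if not 8 <= msg_type <= 11:
        raise ValueError(f"not an artwork message type: {msg_type}")
    payload = data[_HEADER.size:]
    # An empty payload means "clear the artwork on this channel".
    return ArtworkMessage(msg_type - 8, timestamp_us, payload or None)
```

A caller would still need to check that an artwork stream is active before accepting the message, and to convert timestamp_us to its local clock before scheduling the display.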
This section describes messages specific to clients with the visualizer role, which create visual representations of the audio being played. Visualizer clients receive audio analysis data like FFT information that corresponds to the current audio timeline.
The visualizer@v1_support object in client/hello has this structure:
- visualizer@v1_support: object
  - Desired FFT details (to be determined)
  - buffer_capacity: integer - max size in bytes of visualization data messages in the buffer that are yet to be displayed
The visualizer object in stream/start has this structure:
- visualizer: object
  - FFT details (to be determined)
When stream/clear includes the visualizer role, clients should clear all buffered visualization data and continue with data received after this message.
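One way a client might hold pending visualization data and react to stream/clear is sketched below. The class VisualizerBuffer and its methods are hypothetical. Two things are assumptions not stated in this section: buffer_capacity is counted against payload bytes only, and the client refuses data that would exceed its capacity.

```python
from collections import deque


class VisualizerBuffer:
    """Holds visualization frames that are yet to be displayed."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self._frames: deque = deque()  # (timestamp_us, payload) in arrival order
        self._size = 0                 # total payload bytes currently buffered

    def __len__(self) -> int:
        return len(self._frames)

    def push(self, timestamp_us: int, payload: bytes) -> bool:
        """Queue a frame; return False if it would exceed the capacity (assumption)."""
        if self._size + len(payload) > self.capacity_bytes:
            return False
        self._frames.append((timestamp_us, payload))
        self._size += len(payload)
        return True

    def pop_due(self, now_us: int) -> list:
        """Remove and return frames whose timestamp is at or before now_us."""
        due = []
        while self._frames and self._frames[0][0] <= now_us:
            frame = self._frames.popleft()
            self._size -= len(frame[1])
            due.append(frame)
        return due

    def clear(self) -> None:
        """Handle stream/clear: drop everything buffered so far."""
        self._frames.clear()
        self._size = 0
```

After clear() the buffer simply keeps accepting frames, which matches the requirement to continue with data received after the stream/clear message.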
Binary messages should be rejected if there is no active stream.
- Byte 0: message type 16 (uint8)
- Bytes 1-8: timestamp (big-endian int64) - server clock time in microseconds when the visualization should be displayed by the device
- Rest of bytes: visualization data
The timestamp indicates when this visualization data should be displayed, corresponding to the audio timeline. Clients must translate this server timestamp to their local clock using the offset computed from clock synchronization.
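A small sketch of decoding a visualizer message and translating its timestamp to the local clock. The function names are hypothetical. The sign convention of the offset is an assumption: here offset_us means server clock minus local clock, so the local time is the server time minus the offset. An implementation whose clock synchronization defines the offset the other way round would add it instead.

```python
import struct

VISUALIZER_MESSAGE_TYPE = 16
# uint8 message type followed by a big-endian int64 timestamp (9 bytes total).
_HEADER = struct.Struct(">Bq")


def parse_visualizer_message(data: bytes) -> tuple[int, bytes]:
    """Return (server_timestamp_us, visualization_data) for one binary message."""
    if len(data) < _HEADER.size:
        raise ValueError("message shorter than the 9-byte header")
    msg_type, timestamp_us = _HEADER.unpack_from(data)
    if msg_type != VISUALIZER_MESSAGE_TYPE:
        raise ValueError(f"not a visualizer message type: {msg_type}")
    return timestamp_us, data[_HEADER.size:]


def server_to_local_us(server_ts_us: int, offset_us: int) -> int:
    """Translate a server timestamp to local time.

    Assumption: offset_us = server_clock - local_clock.
    """
    return server_ts_us - offset_us
```

With an offset of 1,500,000 microseconds under this convention, a frame stamped 5,000,000 on the server clock would be shown at 3,500,000 on the local clock.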