AirTunes 2 protocol
===================
.. contents::
   :depth: 4
Introduction
------------
TODO
In the examples below, placeholder values are enclosed in curly
braces (``{}``). Omit the braces when substituting the actual
values.
Credits
-------
* Apple Inc.
* Rogue Amoeba Software, LLC
Preferred TCP/UDP ports
-----------------------
=========== ====
Connection  Port
=========== ====
RTSP        5000
Audio data  6000
RTP control 6001
Timing      6002
=========== ====
Payload types
-------------
=============== ====
Timing request  0x52
Timing response 0x53
Sync            0x54
Range resend    0x55
=============== ====
Data types
----------
When transferred over the network, multi-byte values must be converted
to network byte order. The packet structures are packed: no alignment
padding may be inserted between fields.
RtpHeader
~~~~~~~~~
::

    /* RTP header bits */
    RTP_HEADER_A_EXTENSION = 0x10;
    RTP_HEADER_A_SOURCE = 0x0f;

    RTP_HEADER_B_PAYLOAD_TYPE = 0x7f;
    RTP_HEADER_B_MARKER = 0x80;

    /* sizeof(RtpHeader) == 4 */
    struct RtpHeader {
        uint8_t a;
        uint8_t b;
        uint16_t seqnum;

        /* extension = bool(a & RTP_HEADER_A_EXTENSION) */
        /* source = a & RTP_HEADER_A_SOURCE */

        /* payload_type = b & RTP_HEADER_B_PAYLOAD_TYPE */
        /* marker = bool(b & RTP_HEADER_B_MARKER) */
    }
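The bit masks above can be exercised with a small packing helper. This is a minimal Python sketch, not a reference implementation: the function names ``pack_rtp_header`` and ``unpack_rtp_header`` are made up for illustration, and the bits of ``a`` that the masks do not cover are simply left at zero because this document does not define them.

```python
import struct

RTP_HEADER_A_EXTENSION = 0x10
RTP_HEADER_A_SOURCE = 0x0F
RTP_HEADER_B_PAYLOAD_TYPE = 0x7F
RTP_HEADER_B_MARKER = 0x80

# "!" selects network byte order and disables alignment padding.
RTP_HEADER = struct.Struct("!BBH")


def pack_rtp_header(payload_type, seqnum, marker=False, extension=False, source=0):
    """Build the 4-byte header (hypothetical helper name)."""
    a = (RTP_HEADER_A_EXTENSION if extension else 0) | (source & RTP_HEADER_A_SOURCE)
    b = (RTP_HEADER_B_MARKER if marker else 0) | (payload_type & RTP_HEADER_B_PAYLOAD_TYPE)
    return RTP_HEADER.pack(a, b, seqnum & 0xFFFF)


def unpack_rtp_header(data):
    """Split the first 4 bytes of a packet into the documented fields."""
    a, b, seqnum = RTP_HEADER.unpack_from(data)
    return {
        "extension": bool(a & RTP_HEADER_A_EXTENSION),
        "source": a & RTP_HEADER_A_SOURCE,
        "payload_type": b & RTP_HEADER_B_PAYLOAD_TYPE,
        "marker": bool(b & RTP_HEADER_B_MARKER),
        "seqnum": seqnum,
    }
```

Using ``struct`` with the ``!`` prefix covers both requirements from the "Data types" section at once: big-endian fields and no padding.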
RtpTime
~~~~~~~
::

    /* sizeof(RtpTime) == 8 */
    struct RtpTime {
        /* Seconds since 1900-01-01 00:00:00 (TODO: Timezone?) */
        uint32_t integer;

        /* Fraction of second (0..2^32) */
        uint32_t fraction;
    }
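A conversion between Unix time and this structure might look like the following Python sketch. It assumes the 1900 epoch is in UTC, as in NTP; the timezone is still marked TODO above, so treat the constant ``NTP_UNIX_DELTA`` and both helper names as assumptions.

```python
import struct

# Seconds between 1900-01-01 and 1970-01-01, assuming both epochs are UTC.
NTP_UNIX_DELTA = 2208988800


def pack_rtp_time(unix_seconds):
    """Encode a Unix timestamp (float seconds) as an 8-byte RtpTime."""
    ntp = unix_seconds + NTP_UNIX_DELTA
    integer = int(ntp) & 0xFFFFFFFF
    # Scale the fractional second to the full 32-bit range.
    fraction = int((ntp - int(ntp)) * 2**32) & 0xFFFFFFFF
    return struct.pack("!II", integer, fraction)


def unpack_rtp_time(data):
    """Decode an 8-byte RtpTime back to Unix time (float seconds)."""
    integer, fraction = struct.unpack("!II", data[:8])
    return integer - NTP_UNIX_DELTA + fraction / 2**32
```

Half a second after the Unix epoch therefore encodes as ``integer = 2208988800`` and ``fraction = 0x80000000``.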
TimingPacket
~~~~~~~~~~~~
::

    /* sizeof(TimingPacket) == 32 */
    struct TimingPacket {
        RtpHeader header;
        /* 32 bits wide, otherwise the packet would be 36 bytes */
        uint32_t timestamp;
        RtpTime reference_time;
        RtpTime received_time;
        RtpTime send_time;
    }
SyncPacket
~~~~~~~~~~
::

    /* sizeof(SyncPacket) == 20 */
    struct SyncPacket {
        RtpHeader header;
        uint32_t timestamp;
        RtpTime some_time;
        uint32_t next_timestamp;
    }
ResendPacket
~~~~~~~~~~~~
::

    /* sizeof(RtpResendHeader) == 8 */
    struct RtpResendHeader {
        RtpHeader header;
        uint16_t missed_seqnum;
        uint16_t count;
    }
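As an illustration of how a sender could decode such a request, here is a minimal Python sketch. It assumes that the request covers ``count`` consecutive sequence numbers starting at ``missed_seqnum`` and that sequence numbers wrap at 16 bits; this document does not state either point, and ``missing_seqnums`` is a hypothetical helper name.

```python
import struct

PAYLOAD_RANGE_RESEND = 0x55

# a, b, seqnum, missed_seqnum, count: 8 bytes, network byte order, no padding
RESEND_HEADER = struct.Struct("!BBHHH")


def missing_seqnums(data):
    """Return the sequence numbers a resend request asks for."""
    _a, b, _seqnum, missed_seqnum, count = RESEND_HEADER.unpack_from(data)
    if b & 0x7F != PAYLOAD_RANGE_RESEND:
        raise ValueError("not a range resend packet")
    # Assumption: consecutive range, wrapping around at 2^16.
    return [(missed_seqnum + i) & 0xFFFF for i in range(count)]
```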
RTSP
----
Common request headers
~~~~~~~~~~~~~~~~~~~~~~
.. _rtp-info:
================ =================================================
Client-Instance  | 64 random bytes in hex. Must be unique per
                   connection.
CSeq             | Request sequence number. Either keep a local
                   counter or increment the sequence number of
                   the previous response by one.
RTP-Info         ``rtptime={RTP timestamp}``
Session          Server session ID (after SETUP)
User-Agent       | ``iTunes/{Version} (Windows; N;)``
                   (e.g. Version=``7.6.2``)
================ =================================================
Request URI
~~~~~~~~~~~
Unless specified otherwise, ``rtsp://{Local IP address}/{Client session ID}``
must be used as the request URI. The client session ID is a random number
between 0 and 2^32.
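The default request URI can be assembled as in this short Python sketch. Both function names are hypothetical, and the use of the ``secrets`` module is only one possible source of randomness.

```python
import secrets


def new_client_session_id():
    """Pick a random client session ID in the range 0 <= id < 2^32."""
    return secrets.randbelow(2**32)


def request_uri(local_ip, client_session_id):
    """Build the default request URI described above."""
    return "rtsp://{}/{}".format(local_ip, client_session_id)
```

For example, ``request_uri("192.168.1.10", 42)`` yields ``rtsp://192.168.1.10/42``.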
ANNOUNCE
~~~~~~~~
======= ===========================================================
Headers | ``Content-Type: application/sdp``
Body    | ``v=0\r\n``
        | ``o=iTunes {Client session ID} 0 IN IP4 {Local IP address}\r\n``
        | ``s=iTunes\r\n``
        | ``c=IN IP4 {Server IP address}\r\n``
        | ``t=0 0\r\n``
        | ``m=audio 0 RTP/AVP 96\r\n``
        | ``a=rtpmap:96 AppleLossless\r\n``
        | ``a=fmtp:96 {Frames per packet} 0 16 40 10 14 2 255 0 0 44100\r\n``
        | ``a=rsaaeskey:{AES key in base64 w/o padding}\r\n``
        | ``a=aesiv:{AES IV in base64 w/o padding}\r\n``
        | ``\r\n``
======= ===========================================================
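The body above can be generated with a few lines of Python. This is a sketch under stated assumptions: ``announce_body`` and ``b64_nopad`` are hypothetical names, the third field of the ``o=`` line is written as the digit ``0`` (SDP expects a numeric session version there), and ``aes_key`` is taken as the ready-to-send bytes for ``a=rsaaeskey``; any key wrapping implied by that attribute name is outside this sketch.

```python
import base64


def b64_nopad(data):
    """Base64-encode bytes and strip the trailing '=' padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def announce_body(client_session_id, local_ip, server_ip,
                  aes_key, aes_iv, frames_per_packet=352):
    """Assemble the SDP body for ANNOUNCE, one CRLF-terminated line each."""
    lines = [
        "v=0",
        "o=iTunes {} 0 IN IP4 {}".format(client_session_id, local_ip),
        "s=iTunes",
        "c=IN IP4 {}".format(server_ip),
        "t=0 0",
        "m=audio 0 RTP/AVP 96",
        "a=rtpmap:96 AppleLossless",
        "a=fmtp:96 {} 0 16 40 10 14 2 255 0 0 44100".format(frames_per_packet),
        "a=rsaaeskey:{}".format(b64_nopad(aes_key)),
        "a=aesiv:{}".format(b64_nopad(aes_iv)),
        "",  # the trailing empty line from the table
    ]
    return "\r\n".join(lines) + "\r\n"
```

The default of 352 frames per packet is taken from the "Other numbers" table.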
FLUSH
~~~~~
======= =============================================
Headers ``RTP-Info: seq={Last RTP seqnum};rtptime=0``
======= =============================================
OPTIONS
~~~~~~~
======= ============================================================
URI     ``*``
Headers ``Apple-Challenge: {16 random bytes in base64 w/o padding}``
======= ============================================================
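Generating the challenge value is straightforward; the Python sketch below shows one way to do it. The helper names are hypothetical, and the request line uses the standard ``RTSP/1.0`` version string with only the ``CSeq`` header added, so other common headers would have to be appended by the caller.

```python
import base64
import os


def apple_challenge():
    """Return 16 random bytes as base64 without '=' padding."""
    return base64.b64encode(os.urandom(16)).decode("ascii").rstrip("=")


def options_request(cseq, challenge):
    """Format a minimal OPTIONS request with the ``*`` URI."""
    return (
        "OPTIONS * RTSP/1.0\r\n"
        "CSeq: {}\r\n"
        "Apple-Challenge: {}\r\n"
        "\r\n"
    ).format(cseq, challenge)
```

Sixteen bytes always encode to 24 base64 characters ending in ``==``, so the unpadded value is 22 characters long.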
RECORD
~~~~~~
======= =========================================
Headers | ``Range: ntp={Note 1}``
        | ``RTP-Info: seq={Note 2};rtptime={Note 3}``
======= =========================================

Note 1: Normal play time (apparently always 0), float, >=0. (TODO)

Note 2: Apparently a random number between 0 and 8192. (TODO)

Note 3: Apparently always zero. (TODO)
SET_PARAMETER
~~~~~~~~~~~~~
Setting volume
``````````````
======= =================================
Headers ``Content-Type: text/parameters``
Body    ``volume: %f``
======= =================================

Volume is either ``-144.0`` (muted) or a value in the range ``-30.0``
to ``0.0``.
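The allowed values can be enforced before formatting the body, as in this minimal Python sketch. ``volume_body`` is a hypothetical helper, and rejecting out-of-range values with ``ValueError`` is a design choice of the sketch, not something the protocol prescribes.

```python
MUTED = -144.0


def volume_body(volume):
    """Format the SET_PARAMETER body for a volume change."""
    if volume != MUTED and not -30.0 <= volume <= 0.0:
        raise ValueError("volume must be -144.0 or within -30.0..0.0")
    return "volume: %f" % volume
```

For example, ``volume_body(-15.5)`` returns ``volume: -15.500000``.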
Set progress
````````````
======= =================================
Headers ``Content-Type: text/parameters``
Body    ``progress: %f/%f/%f``
======= =================================

The three values are RTP timestamps, given as unsigned integers (TODO).
Set DAAP metadata
`````````````````
======= =============================================
Headers | ``Content-Type: application/x-dmap-tagged``
        | RTP-Info_
Body    DAAP metadata
======= =============================================
SETUP
~~~~~
======= ====================================================
Headers ``Transport: RTP/AVP/UDP;unicast;interleaved=0-1;mode=record;control_port={Control port};timing_port={Timing port}``
======= ====================================================
Read ``server_port``, ``control_port`` and ``timing_port`` from the
``Transport`` response header. Use the value of the ``Session`` response
header as the server session ID.
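Extracting the three ports from the response could be done as in the Python sketch below. ``parse_transport_ports`` is a hypothetical name, and the sample header in the usage note is constructed for illustration from the preferred ports table, not captured from a real device.

```python
PORT_KEYS = ("server_port", "control_port", "timing_port")


def parse_transport_ports(transport):
    """Return the port parameters found in a Transport header value."""
    ports = {}
    for part in transport.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key in PORT_KEYS:
            ports[key] = int(value)
    return ports
```

Given ``RTP/AVP/UDP;unicast;mode=record;server_port=6000;control_port=6001;timing_port=6002``, the function returns a dictionary mapping the three names to 6000, 6001 and 6002. Parameters that are absent are simply missing from the result, so callers should check for all three keys.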
TEARDOWN
~~~~~~~~
Nothing special.
Rogue Amoeba extensions
~~~~~~~~~~~~~~~~~~~~~~~
X_RA_SET_ALBUM_ART
``````````````````
Use this method only if the server wants PList metadata. If the server
requests DAAP metadata, use the ``SET_PARAMETER`` method instead.

======= ========================================
Headers | ``Content-Type: {Image content type}``
        | RTP-Info_
Body    Image data
======= ========================================
X_RA_SET_PLIST_METADATA
```````````````````````
======= ===================================
Headers | ``Content-Type: application/xml``
        | RTP-Info_
Body    Metadata in PList format
======= ===================================
Detect speaker type
~~~~~~~~~~~~~~~~~~~
If the response contains an ``Audio-Jack-Status`` header:

::

    speaker_type() {
        /* Check "disconnected" first: it also contains "connected". */
        if ("disconnected" in Audio-Jack-Status) {
            return unplugged;
        } else if ("connected" in Audio-Jack-Status) {
            if ("digital" in Audio-Jack-Status) {
                return digital;
            }
            return analog;
        }
        return unknown;
    }
Detect metadata and audio latency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If the response contains an ``Apple-Response``, ``Server`` or
``Audio-Latency`` header:

::

    metadata_type() {
        if (Apple-Response in response) {
            lowercase_password = False;
            audio_format = EncryptedALAC;
            wants_album_art = False;
            wants_metadata = False;
            wants_progress = False;
            has_bad_latency_header = False;
        }

        if (Server in response) {
            lowercase_password = True;
            has_bad_latency_header = True;
            if (not Apple-Response in response) {
                audio_format = UnencryptedALAC;
                wants_album_art = DAAP;
                wants_metadata = DAAP;
                wants_progress = True;
            }
        }

        if (Audio-Latency in response) {
            if (not has_bad_latency_header) {
                audio_latency = Audio-Latency;
            } else {
                if (Audio-Latency == 322 or
                    Audio-Latency == 15049) {
                    audio_latency = 11025;
                }
                /* Why always 11025? */
                audio_latency = 11025;
            }
        }
    }
Timing
------
Replying to timing packet
~~~~~~~~~~~~~~~~~~~~~~~~~
::

    on_timing_packet(TimingPacket req) {
        assert req.header.payload_type == PAYLOAD_TIMING_REQUEST;

        TimingPacket res;
        res.header = req.header;
        res.header.payload_type = PAYLOAD_TIMING_RESPONSE;
        res.reference_time = req.send_time;
        res.received_time = time_now();
        res.send_time = time_now();
        send(res);
    }
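The reply logic can be written out on the byte level as in the following Python sketch. It rests on several assumptions that this document does not confirm: the 32-byte packet size is taken as authoritative, so the ``timestamp`` field is treated as 32 bits wide; that field is copied unchanged from the request because the pseudocode does not set it; and the 1900 epoch is assumed to be UTC. ``build_timing_response`` and ``rtp_time_now`` are hypothetical names.

```python
import struct
import time

PAYLOAD_TIMING_REQUEST = 0x52
PAYLOAD_TIMING_RESPONSE = 0x53
NTP_UNIX_DELTA = 2208988800  # 1900 -> 1970, assuming UTC

# Header (a, b, seqnum), 32-bit timestamp, then three (integer, fraction)
# pairs: reference_time, received_time, send_time. 32 bytes in total.
TIMING_PACKET = struct.Struct("!BBHI6I")


def rtp_time_now():
    """Current time as an (integer, fraction) RtpTime pair."""
    now = time.time() + NTP_UNIX_DELTA
    return int(now) & 0xFFFFFFFF, int((now % 1) * 2**32) & 0xFFFFFFFF


def build_timing_response(request, now=rtp_time_now):
    """Turn a raw timing request into the raw timing response."""
    (a, b, seqnum, timestamp,
     _ref_i, _ref_f, _recv_i, _recv_f, send_i, send_f) = TIMING_PACKET.unpack(request)
    if b & 0x7F != PAYLOAD_TIMING_REQUEST:
        raise ValueError("not a timing request")
    b = (b & 0x80) | PAYLOAD_TIMING_RESPONSE  # keep the marker bit
    received = now()
    sent = now()
    # The request's send time becomes the response's reference time.
    return TIMING_PACKET.pack(a, b, seqnum, timestamp,
                              send_i, send_f, *received, *sent)
```

The ``now`` parameter exists only so that the clock can be replaced in tests.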
Sync
----
Sync packets are sent once per second or when adding a speaker.
Sending sync packet
~~~~~~~~~~~~~~~~~~~
::

    send_sync(uint32_t timestamp, bool first) {
        SyncPacket packet;
        packet.header.payload_type = PAYLOAD_SYNC;
        packet.header.marker = True;
        packet.header.seqnum = 7; /* Why fixed? */
        if (first) {
            packet.header.extension = True;
        }
        packet.timestamp = /* TODO */;
        packet.next_timestamp = timestamp;
        packet.some_time = /* TODO */;
    }
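Serialized to bytes, a sync packet could be built as in this Python sketch. Because the values of ``timestamp`` and ``some_time`` are still marked TODO above, the sketch leaves them to the caller instead of guessing; ``build_sync_packet`` is a hypothetical name.

```python
import struct

PAYLOAD_SYNC = 0x54

# Header (a, b, seqnum), timestamp, some_time (integer, fraction),
# next_timestamp. 20 bytes in total.
SYNC_PACKET = struct.Struct("!BBHIIII")


def build_sync_packet(timestamp, next_timestamp, some_time, first=False):
    """Pack a sync packet; some_time is an (integer, fraction) pair."""
    a = 0x10 if first else 0x00   # extension bit only on the first packet
    b = 0x80 | PAYLOAD_SYNC       # marker bit is always set
    integer, fraction = some_time
    return SYNC_PACKET.pack(a, b, 7,  # seqnum is fixed at 7
                            timestamp & 0xFFFFFFFF,
                            integer, fraction,
                            next_timestamp & 0xFFFFFFFF)
```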
Metadata
--------
DAAP metadata
~~~~~~~~~~~~~
=============== =============================
Content-type    ``application/x-dmap-tagged``
Item name field ``dmap.itemname``
Artist field    ``daap.songartist``
Album field     ``daap.songalbum``
=============== =============================
PList metadata
~~~~~~~~~~~~~~
=============== =============================
Content-type    ``application/xml``
Title field     ``title``
Artist field    ``artist``
Album field     ``album``
=============== =============================
Zeroconf TXT record
-------------------
======= =======================================================
Field   Description
======= =======================================================
txtvers TXT record version (always ``1``)
pw      ``true`` if password required, ``false`` otherwise
sr      Audio sample rate
ss      Audio sample size (bits per sample)
ch      Number of audio channels
tp      Protocol (``UDP`` [TODO: or ``TCP``?])
======= =======================================================
Rogue Amoeba extensions
~~~~~~~~~~~~~~~~~~~~~~~
============== =======================================
Field          Description
============== =======================================
rast           ``afs`` if Airfoil speaker
ramach         ``{Platform name}.{OS major version}``
raver          Library version
raAudioFormats TODO
============== =======================================
Other numbers
-------------
======================== =======================
Audio frames per packet  352
Shorts per packet (TODO) 704
Timestamps per second    44100
Time sync interval       44100 (once per second)
Recovery buffer size     1000 packets
======================== =======================
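Some of these figures follow from each other, which the short Python calculation below makes explicit. It assumes two audio channels, which is an assumption that would explain the 704 shorts per packet; the derived rates are not stated anywhere in this document.

```python
FRAMES_PER_PACKET = 352
CHANNELS = 2  # assumption: stereo
TIMESTAMPS_PER_SECOND = 44100
RECOVERY_BUFFER_PACKETS = 1000

# One 16-bit sample ("short") per channel and frame.
shorts_per_packet = FRAMES_PER_PACKET * CHANNELS

# Audio packets needed per second of playback.
packets_per_second = TIMESTAMPS_PER_SECOND / FRAMES_PER_PACKET

# Playback time covered by a full recovery buffer.
recovery_buffer_seconds = (
    RECOVERY_BUFFER_PACKETS * FRAMES_PER_PACKET / TIMESTAMPS_PER_SECOND
)
```

Under these assumptions a sender emits roughly 125 packets per second, and the recovery buffer holds about 8 seconds of audio.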