Discussion:
[xiph-rtp] Disambiguation update
Luca Barbato
2005-09-29 09:26:12 UTC
The next step is:

- Modify the info header format
- Define the packing layout for offband delivery
- Define inline SDP delivery.

Would it be worth sending just the codebook (which adds a bit of complexity),
or should we just send the full three-header set laced?
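
For readers unfamiliar with the "laced" option, here is a minimal sketch of Xiph-style lacing as it is commonly used to store the three Vorbis headers in a single blob (for example in Matroska). It is an illustration only, not the format the draft settles on: the function names `xiph_lace` and `xiph_unlace` and the placeholder header bytes are assumptions.

```python
def xiph_lace(packets):
    """Pack packets as: (count - 1), laced sizes of all but the last, then the data."""
    if not 1 <= len(packets) <= 256:
        raise ValueError("lacing holds between 1 and 256 packets")
    out = bytearray([len(packets) - 1])
    for p in packets[:-1]:
        n = len(p)
        out += b"\xff" * (n // 255)  # one 0xFF byte per full 255 bytes of size
        out.append(n % 255)          # terminating byte below 255 (may be 0)
    for p in packets:
        out += p
    return bytes(out)


def xiph_unlace(blob):
    """Inverse of xiph_lace; the last packet's size is implied by the blob length."""
    count = blob[0] + 1
    pos = 1
    sizes = []
    for _ in range(count - 1):
        size = 0
        while blob[pos] == 255:
            size += 255
            pos += 1
        size += blob[pos]
        pos += 1
        sizes.append(size)
    packets = []
    for size in sizes:
        packets.append(blob[pos:pos + size])
        pos += size
    packets.append(blob[pos:])
    return packets


# Placeholder bytes standing in for the three Vorbis headers (not real headers).
headers = [b"I" * 30, b"C" * 300, b"S" * 4000]
laced = xiph_lace(headers)
```

In this sketch the three-header set costs four bytes of framing (one count byte, one size byte for the first header, two for the second), which is the trade-off against sending the codebook alone.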

Yet to be decided:

- Use a tag or a hash?
- base16 or base64
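
As a rough guide to the base16 versus base64 question, the sketch below compares the text size each encoding produces for the same binary blob, using Python's standard `base64` module. The 4096-byte `header` is a made-up placeholder, not a real Vorbis configuration header.

```python
import base64

# Placeholder payload standing in for packed configuration headers (assumption).
header = bytes(range(256)) * 16  # 4096 bytes

hex_text = base64.b16encode(header)  # base16: two characters per byte
b64_text = base64.b64encode(header)  # base64: four characters per three bytes

print(len(header), len(hex_text), len(b64_text))
```

Base16 doubles the size but is trivial to read and generate by hand; base64 grows the data by about a third, which matters more as the headers embedded in the SDP get larger.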

lu
--
Luca Barbato

Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero

-------------- next part --------------
A non-text attachment was scrubbed...
Name: draft-ietf-avt-vorbis-rtp-00.txt.gz
Type: application/gzip
Size: 12923 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/xiph-rtp/attachments/20050929/57c70b65/draft-ietf-avt-vorbis-rtp-00.txt-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: update20050929.diff
Type: text/x-patch
Size: 17016 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/xiph-rtp/attachments/20050929/57c70b65/update20050929-0001.bin
David Barrett
2005-09-29 15:20:56 UTC
I'm sorry, I'm confused by this email. Can you summarize:

- Why do we need to modify the info header format, and how?

- By "Define inline SDP delivery" do you mean "Define the SDP syntax
that signals codebooks will be delivered inline via RTP"?

- Where would we use either "base16 or base64"? Maybe in the SDP somewhere?

-david
Post by Luca Barbato
- Modify the info header format
- Define the packing layout for offband delivery
- Define inline SDP delivery.
Would worth send just the codebook (and add a bit of complexity) or just
send the full 3header set laced?
- Use a tag or an hash?
- base16 or base64
lu
------------------------------------------------------------------------
Index: draft-ietf-avt-vorbis-rtp-00.xml
===================================================================
--- draft-ietf-avt-vorbis-rtp-00.xml (revision 9935)
+++ draft-ietf-avt-vorbis-rtp-00.xml (working copy)
@@ -29,7 +29,7 @@
<abstract>
<t>This document describes an RTP payload format for transporting Vorbis encoded audio. It details the RTP encapsulation
mechanism for raw Vorbis data and details the delivery mechanisms for the decoder probability model, referred to as a
-codebook, metadata and other setup information.</t>
+codebook, comments and other setup information.</t>
<t>
Also included within the document are the necessary details for the use of Vorbis with MIME and Session Description Protocol
@@ -223,7 +223,7 @@
<t> 0 = Raw Vorbis payload</t>
<t> 1 = Vorbis Setup payload</t>
<t> 2 = Vorbis Codebook payload</t>
-<t> 3 = Vorbis Metadata payload</t>
+<t> 3 = Vorbis Comment payload</t>
</list>
<t>
@@ -492,37 +492,34 @@
<t>
To decode a Vorbis stream three configuration header blocks are needed. The first header indicates the sample and bitrates, the
number of channels and the version of the Vorbis encoder used. The second header contains the decoders probability model, or
-codebook and the third header details stream metadata.
+codebook and the third header details stream comments.
</t>
<t>
-As the RTP stream may change certain configuration data mid-session there are two different methods for delivering this
-configuration data to a client, in-band and SDP which is detailed below. SDP delivery is used to set-up an initial
-state for the client application and in-band is used to change state during the session. The changes may be due to
-different metadata or codebooks as well as different bitrates of the stream.
+As the RTP stream may change certain configuration data mid-session there are different methods for delivering this configuration data to a client, both in-band and off-band which is detailed below. SDP delivery is used to set-up an initial state for the client application. The changes may be due to different codebooks as well as different bitrates of the stream.
</t>
<t>
-Out of the two delivery vectors the use of an SDP attribute to indicate an URI where the configuration and codebook data
-can be obtained is preferred as they can be fetched reliably using TCP. The in-band codebook delivery SHOULD
-only be used in situations where the link between the client is unidirectional or if the SDP-based information is not available.
+The delivery vectors in use are specified by an SDP attribute to indicate the method and URI where the configuration and codebook data can be obtained if applies. Different delivery methods COULD be advertized for the same session. The in-band codebook delivery SHOULD be considered as baseline, off-band delivery methods that don't use RTP will not be described in this document. {FIXME add reference to documents where it is described? Remove the last sentence?}
</t>
<t>
-Synchronizing the configuration and codebook headers to the RTP stream is critical. The 32 bit Codebook Ident field is used
-to indicate when a change in the stream has taken place. The client application MUST have in advance the correct configuration
-and codebook headers and if the client detects a change in the Ident value and does not have this information it MUST NOT
-decode the raw Vorbis data.
+Synchronizing the configuration and codebook headers to the RTP stream is critical. The 32 bit Codebook Ident field is used to indicate when a change in the stream has taken place. The client application MUST have in advance the correct configuration and codebook headers and if the client detects a change in the Ident value and does not have this information it MUST NOT decode the raw Vorbis data.
+{FIXME keep the codebook ident field or use a shorter tag field?}
</t>
+<section anchor="Initial SDP Header Setup" title="Initial SDP Header Setup">
+<t>
+The initial configuration MUST be reported in the configuration parameter.
+All the configuration headers known in advance SHOULD be reported in the same way.
+{FIXME better wording, if the tag id will be used instead of the Codebook hash here should go the mappings in the for of list order -> tag id number. Define the metadata/configuration data in a non ambiguous way. Define the format used.}
+
+</t>
+</section>
<section anchor="In-band Header Transmission" title="In-band Header Transmission">
<t>
-The three header data blocks are sent in-band with the packet type bits set to match the payload type. Normally the codebook
-and configuration headers are sent once per session if the stream is an encoding of live audio, as typically
-the encoder state will not change, but the encoder state can change at the boundary of chained Vorbis audio files. Metadata
-can be sent at the start as well as any time during the life of the session. Clients MUST be capable of dealing with periodic
-re-transmission of the configuration headers.
+The three header data blocks are sent in-band with the packet type bits set to match the payload type. Comment packets SHOULD be empty and comment information SHOULD be delivered using SDP. Clients MUST be capable of dealing with periodic re-transmission of the configuration headers.
</t>
<section anchor="Setup Header" title="Setup Header">
@@ -533,6 +530,7 @@
and Num Audio Channels are set in accordance with <xref target="vorbis-spec-ref"></xref> with the bsz fields above referring
to the blocksize parameters. The framing bit is not used for RTP transportation and so applications constructing Vorbis files
MUST take care to set this if required.
+{FIXME The Version header could be just one byte and the channels information is one byte, that way we could spare 4byte}
</t>
<figure anchor="Setup Header Figure" title="Setup Header">
@@ -576,17 +574,16 @@
</t>
<t>
-The configuration information detailed below MUST be completely intact, as a client can not decode a stream with an
-incomplete or corrupted codebook set.
+The configuration information detailed below MUST be completely intact, as a client can not decode a stream with an incomplete or corrupted codebook set.
</t>
<t>
-A 16 bit codebook length field precedes the codebook datablock. The length field allows for codebooks to be up to 64K
-in size. Packet fragmentation, as per the Vorbis data, MUST be performed if the codebooks size exceeds path MTU. The
+A 16 bit codebook length field precedes the codebook datablock. The length field allows for codebooks to be up to 64K in size. Packet fragmentation, as per the Vorbis data, MUST be performed if the codebooks size exceeds path MTU. The
Codebook Ident field MUST be set to match the associated codebook needed to decode the Vorbis stream.
</t>
<t>
+{FIXME: the Codebook Ident should be removed and changed with tag byte, yet to be discussed}
The Codebook Ident is the CRC32 checksum of the codebook and is used to detect a corrupted codebook as well as associating
it with its Vorbis data stream. This Ident value MUST NOT be set to the value of the current stream if this header is being
sent before the boundary of the chained file has been reached. If a checksum failure is detected then this is considered to
@@ -659,14 +656,15 @@
</section>
</section>
-<section anchor="Metadata Header" title="Metadata Header">
+<section anchor="Comment Header" title="Comment Header">
<t>
+{FIXME:I'd make it completely optional}
With the payload type flag set to 3, this indicates that the packet contain the comment metadata, such as artist name, track title
-and so on. These metadata messages are not intended to be fully descriptive but to offer basic track/song information. This
-message MUST be sent at the start of the stream, together with the setup and codebook headers, even if it contains no information.
+and so on. These metadata messages SHOULD be delivered using SDP and are not intended to be fully descriptive but to offer basic track/song information. The
+message MUST be sent at the start of the stream, together with the setup and codebook headers, the Comment packet SHOULD be empty.
During a session the metadata associated with the stream may change from that specified at the start, e.g. a live concert
-broadcast changing acts/scenes, so clients MUST have the ability to receive Metadata header blocks. Details on the format of the
+broadcast changing acts/scenes, so clients MUST have the ability to receive Comment header blocks. Details on the format of the
comments can be found in the Vorbis documentation <xref target="v-comment"></xref>.
</t>
@@ -676,7 +674,7 @@
the comment text.
</t>
-<figure anchor="Metadata Header Figure" title="Metadata Header">
+<figure anchor="Comment Header Figure" title="Comment Header">
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -715,8 +713,7 @@
<section anchor="Packed Headers Delivery" title="Packed Headers Delivery">
<t>
-As mentioned above the RECOMMENDED delivery vector for Vorbis configuration data is via an SDP attribute as this retrieval method
-can be performed using a reliable transport protocol.
+As mentioned above the RECOMMENDED delivery vector for Vorbis configuration data is via a retrieval method that can be performed using a reliable transport protocol.
</t>
<figure anchor="Packed Headers Overview Figure" title="Packed Headers Overview">
@@ -734,6 +731,7 @@
</figure>
<t>
+{FIXME use a format used already by other containers?}
As the RTP headers are not required for this method of delivery the
structure of the configuration data is slightly different. The packed header starts with a 32 bit count field which details the number of packed headers that are contained in the bundle. Next is the packed header payload for each chained Vorbis file.
</t>
@@ -755,14 +753,16 @@
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.. Codebook Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Metadata Header ..
+ | Comment Header ..
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- .. Metadata Header |
+ .. Comment Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
</figure>
-<t>The key difference between the in-band format is there is no need for the payload header octet and Codebook Ident field.
+<t>
+{FIXME Xiphlaced packaging suggested}
+The key difference between the in-band format is there is no need for the payload header octet and Codebook Ident field.
Below are examples of the packed headers format.
</t>
@@ -809,7 +809,7 @@
is normally part of this structure is moved to the second field of the overall packed structure.
</t>
-<figure anchor="Packed Metadata Header Figure" title="Packed Metadata Header">
+<figure anchor="Packed Comment Header Figure" title="Packed Comment Header">
<artwork><![CDATA[
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -826,8 +826,7 @@
</figure>
<t>
-The packed Metadata header also as a slightly different structure to that of the RTP payload type with the payload header not being used.
-
+The packed Comment header also has a slightly different structure to that of the RTP payload type with the payload header not being used.
</t>
<section anchor="Packed Headers IANA Considerations" title="Packed Headers IANA Considerations">
@@ -855,7 +854,7 @@
<t>
Encoding considerations:</t><t>
-This type is only defined for transfer via HTTP as specified in RFC XXXX.
+This type is only defined for transfer via non RTP protocols as specified in RFC XXXX.
</t>
<t>
@@ -902,6 +901,7 @@
<section anchor="Codebook Caching" title="Codebook Caching">
<t>
+{FIXME tag or hash}
Codebook caching allows clients that have previously connected to a stream to re-use the associated codebooks and configuration
data. When a client receives a codebook it may store it locally and can compare the CRC32 key with that of the new stream and
begin decoding before it has received any of the headers.
@@ -917,13 +917,11 @@
</t>
<t>
-Out of the three headers, loss of either the Codebook or Setup headers MUST result in the halting of stream decoding.
-Loss of the Metadata header SHOULD NOT be regarded as fatal for decoding. Loss of any of the headers SHOULD be reported to the
+Out of the three headers, loss of either the Codebook or Setup headers MUST result in the halting of stream decoding.
+Loss of the Comment header SHOULD NOT be regarded as fatal for decoding. Loss of any of the headers SHOULD be reported to the
client as well as a loss report sent via RTCP.
</t>
-
-
</section>
</section>
@@ -938,12 +936,14 @@
<t>
Required Parameters:</t><t>
-header indicates the URI of the decoding configuration headers.
+delivery-method: indicates the delivery methods in use
+</t><t>
+configuration: a list with the base16 <xref target="rfc3548"></xref> (hexadecimal) representation of the configuration headers.
</t>
<t>
Optional Parameters: </t><t>
-None.
+configuration-uri: the URI of the decoding configuration headers.
</t>
<t>
@@ -1010,23 +1010,25 @@
<t>The parameter "channels" also goes in "a=rtpmap" as channel count.</t>
<vspace blankLines="1" />
-<t>The parameter "header" goes in the SDP "a=fmpt" attribute.</t>
+<t>The mandated parameters "delivery-method" and "configuration" MUST be included in the SDP "a=fmtp" attribute.</t>
+<vspace blankLines="1" />
+
+<t>The optional parameter "configuration-uri", when present, MUST be included in the SDP "a=fmtp" attribute.</t>
+
</list>
-
<t>
-If the stream comprises chained Vorbis files the configuration and codebook headers for each file SHOULD be packaged together
-and passed to the client using the headers attribute if all the files to be played are known in advance.
+If the stream comprises chained Vorbis files the configuration and codebook headers for each file SHOULD be packaged together and passed to the client using the headers attribute if all the files to be played are known in advance.
+{FIXME: define the configuration package, suggested the xiphlaced one that is standard for Matroska and other open source containers}
</t>
<t>
-The Vorbis configuration specified in the header attribute MUST contain all of the configuration data and codebooks needed for
-the life of the session.
+The Vorbis configuration specified in the configuration-uri attribute MUST point to a location where all of the configuration data and codebooks needed for the life of the session resides.
</t>
<t>
-The port value is specified by the server application bound to the address specified in the c attribute. The bitrate value
-and channels specified in the rtpmap attribute MUST match the Vorbis sample rate value. An example is found below.
+The port value is specified by the server application bound to the address specified in the c attribute. The bitrate value and channels specified in the rtpmap attribute MUST match the Vorbis sample rate value. An example is found below.
+{FIXME add md5sum entry and order in a better way the delivery attribute}
</t>
<vspace blankLines="1" />
@@ -1034,7 +1036,8 @@
<t>c=IN IP4/6 </t>
<t>m=audio RTP/AVP 98</t>
<t>a=rtpmap:98 VORBIS/44100/2</t>
+<t>a=delivery:out_band/http</t>
+<t>a=fmtp:98 delivery-method:out_band/http; configuration=base16string1,base16string2; configuration-uri=http://path/to/the/headers</t>
</list>
<t>
@@ -1045,7 +1048,7 @@
</t>
<t>
-The answer to any offer, <xref target="rfc3264"></xref>, MUST NOT change the URL specified in the header attribute.
+The answer to any offer, <xref target="rfc3264"></xref>, MUST NOT change the URI specified in the configuration-uri attribute.
</t>
</section>
@@ -1169,6 +1172,14 @@
<seriesInfo name="RFC" value="3264" />
</reference>
+<reference anchor="rfc3548">
+<front>
+<title>The Base16, Base32, and Base64 Data Encodings</title>
+<author initials="S." surname="Josefsson" fullname="Simon Josefsson"></author>
+</front>
+<seriesInfo name="RFC" value="3548" />
+</reference>
+
<reference anchor="rtcp-feedback">
<front>
<title>Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)</title>
@@ -1210,7 +1221,7 @@
picture. International Telecommunications Union. Available from the ITU website, http://www.itu.int</title>
</front>
</reference>
-
+
</references>
</back>
</rfc>
------------------------------------------------------------------------
_______________________________________________
xiph-rtp mailing list
http://lists.xiph.org/mailman/listinfo/xiph-rtp
David Barrett
2005-09-29 17:52:52 UTC
Post by Luca Barbato
- Use a tag or an hash?
lu
I'm not passionate either way, but I vote hash. My breakdown is as follows:

Tag:
- Pro: one byte
- Con: Limited to 256 entries
- Pro/Con: Centrally organized namespace

Hash:
- Pro: Unlimited entries
- Con: Four bytes
- Pro/Con: No central codebook namespace
- Con: Infinitesimal potential for collision
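
To make the namespace point concrete, here is a small sketch under stated assumptions. The draft describes the Codebook Ident as a CRC32 of the codebook; using `zlib.crc32` for it is an assumption about the exact CRC variant, and the `TagTable` class and the placeholder codebooks are hypothetical.

```python
import zlib


def codebook_ident(codebook: bytes) -> int:
    """32-bit ident derived from the codebook bytes alone (assumed CRC variant)."""
    return zlib.crc32(codebook) & 0xFFFFFFFF


class TagTable:
    """Hypothetical one-byte tag allocator: tags depend on assignment order."""

    MAX_TAGS = 256

    def __init__(self):
        self._tags = {}

    def assign(self, codebook: bytes) -> int:
        if codebook in self._tags:
            return self._tags[codebook]
        if len(self._tags) >= self.MAX_TAGS:
            raise OverflowError("one-byte tag space exhausted")
        tag = len(self._tags)
        self._tags[codebook] = tag
        return tag


# Placeholder codebooks (not real Vorbis data).
book_a = b"placeholder codebook A"
book_b = b"placeholder codebook B"

# Two senders that meet the same codebooks in a different order.
source, repeater = TagTable(), TagTable()
source.assign(book_a)
source.assign(book_b)
repeater.assign(book_b)
repeater.assign(book_a)

print(source.assign(book_a), repeater.assign(book_a))  # tags disagree
print(codebook_ident(book_a))  # any sender computes the same ident
```

The hash needs no coordination because it is a function of the codebook itself, whereas a tag only means something relative to whoever assigned it.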

I'm voting hash purely because of its support for unlimited entries.
And no, I don't have any grand ambitions to use more than 256 entries.
But given how much work we've gone through to allow codebook switching,
I think we should go the extra step and make it unlimited.

It'd be sad if we put so much effort into it, only to find out down the
road that there's some killer app -- one we might not be able to
visualize now -- that requires unlimited codebooks. I'll admit, it's
not the most persuasive reasoning, but the potential upside (potential
killer app that only Vorbis/Theora can solve) outweighs the downside
(three extra bytes per packet).
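
To put the three extra bytes in perspective, a back-of-the-envelope sketch follows. The packet rate of 50 packets per second is an assumed figure for illustration, not something taken from the draft, and `ident_overhead_bps` is a hypothetical helper.

```python
def ident_overhead_bps(packets_per_second, hash_bytes=4, tag_bytes=1):
    """Extra bits per second spent on a hash ident instead of a one-byte tag."""
    return (hash_bytes - tag_bytes) * 8 * packets_per_second


# Assumed packet rate; real streams depend on block size and packing.
extra = ident_overhead_bps(50)
share_of_56k = extra / 56_000  # fraction of a 56 kbps link

print(extra, round(share_of_56k * 100, 2))
```

At that assumed rate the difference is 1200 bit/s, roughly 2% of a 56 kbps link and negligible at higher bandwidths.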

(And as for what this killer app might be, consider a TV station using
Vorbis for audio and Theora for video, broadcast to your handheld
player. As you move throughout the environment your bandwidth might
vary anywhere from 56Kbps to 56Mbps. An ideal server might continuously
vary the encoding rates, continuously optimize the codebooks, crank up
the resolution to HD or down to beans. It could trickle codebooks to
you before changing, etc etc. I'm just tossing out ideas.)

Something that's either good or bad depending on your perspective is the
actual tag namespace management. If we have tags, *someone* has to pick
which codebook is assigned to which tag, and then everyone who gets that
stream (or rebroadcasts it) needs to agree. But if we use hashes, then
nobody needs to pre-decide on a tag/codebook mapping, and this might
enable greater decentralized broadcasting and re-encoding (i.e., a source
broadcasts in one and repeaters re-encode in another, and you needn't
re-synchronize your tag index when switching repeaters).

Anyway, as I said, I'm not passionate on the issue, but were I forced to
decide I'd pick hashes for their flexibility and limitlessness.

What's your take?

-david
