Discussion:
[xiph-rtp] First tentative update
Luca Barbato
2005-09-04 18:44:37 UTC
It is incomplete and probably wrong. I'm still swamped with exams, but having
something to examine and fix could be a bit better than digging
everything out of long threads.

Comments and critiques welcome as usual

lu
--
Luca Barbato

Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero

-------------- next part --------------
Index: draft-ietf-avt-vorbis-rtp-00.xml
===================================================================
--- draft-ietf-avt-vorbis-rtp-00.xml (revision 9935)
+++ draft-ietf-avt-vorbis-rtp-00.xml (working copy)
@@ -496,20 +496,16 @@
</t>

<t>
-As the RTP stream may change certain configuration data mid-session there are two different methods for delivering this
-configuration data to a client, in-band and SDP which is detailed below. SDP delivery is used to set-up an initial
-state for the client application and in-band is used to change state during the session. The changes may be due to
-different metadata or codebooks as well as different bitrates of the stream.
+As the RTP stream may change certain configuration data mid-session there are different methods for delivering this configuration data to a client, both in-band and off-band which is detailed below. SDP delivery is used to set-up an initialstate for the client application. The changes may be due to different codebooks as well as different bitrates of the stream.
</t>

<t>
-Out of the two delivery vectors the use of an SDP attribute to indicate an URI where the configuration and codebook data
-can be obtained is preferred as they can be fetched reliably using TCP. The in-band codebook delivery SHOULD
-only be used in situations where the link between the client is unidirectional or if the SDP-based information is not available.
+The delivery vectors in use are specified by an SDP attribute to indicate the method and URI where the configuration and codebook data
+can be obtained if applies. Different delivery methods COULD be advertized for the same session. The in-band codebook delivery SHOULD be considered as baseline, off-band delivery methods that don't use RTP will not be described in this document. {FIXME add reference to documents where it is described? Remove the last sentence?}
</t>

<t>
-Synchronizing the configuration and codebook headers to the RTP stream is critical. The 32 bit Codebook Ident field is used
+Synchronizing the configuration and codebook headers to the RTP stream is critical. The 32 bit Codebook Ident field is used
to indicate when a change in the stream has taken place. The client application MUST have in advance the correct configuration
and codebook headers and if the client detects a change in the Ident value and does not have this information it MUST NOT
decode the raw Vorbis data.
@@ -518,11 +514,7 @@
<section anchor="In-band Header Transmission" title="In-band Header Transmission">

<t>
-The three header data blocks are sent in-band with the packet type bits set to match the payload type. Normally the codebook
-and configuration headers are sent once per session if the stream is an encoding of live audio, as typically
-the encoder state will not change, but the encoder state can change at the boundary of chained Vorbis audio files. Metadata
-can be sent at the start as well as any time during the life of the session. Clients MUST be capable of dealing with periodic
-re-transmission of the configuration headers.
+The three header data blocks are sent in-band with the packet type bits set to match the payload type. Metadata packets SHOULD be empty and metadata information SHOULD be delivered using SDP. Clients MUST be capable of dealing with periodic re-transmission of the configuration headers.
</t>

<section anchor="Setup Header" title="Setup Header">
@@ -533,6 +525,7 @@
and Num Audio Channels are set in accordance with <xref target="vorbis-spec-ref"></xref> with the bsz fields above referring
to the blocksize parameters. The framing bit is not used for RTP transportation and so applications constructing Vorbis files
MUST take care to set this if required.
+{FIXME The Version header could be just one byte and the channels information is one byte, that way we could spare 4byte}
</t>

<figure anchor="Setup Header Figure" title="Setup Header">
@@ -587,6 +580,7 @@
</t>

<t>
+{FIXME: the Codebook Ident should be removed and changed with tag byte, yet to be discussed}
The Codebook Ident is the CRC32 checksum of the codebook and is used to detect a corrupted codebook as well as associating
it with its Vorbis data stream. This Ident value MUST NOT be set to the value of the current stream if this header is being
sent before the boundary of the chained file has been reached. If a checksum failure is detected then this is considered to
@@ -662,9 +656,10 @@
<section anchor="Metadata Header" title="Metadata Header">

<t>
+{FIXME:I'd make it completely optional}
With the payload type flag set to 3, this indicates that the packet contain the comment metadata, such as artist name, track title
-and so on. These metadata messages are not intended to be fully descriptive but to offer basic track/song information. This
-message MUST be sent at the start of the stream, together with the setup and codebook headers, even if it contains no information.
+and so on. These metadata messages SHOULD be delivered using SDP and are not intended to be fully descriptive but to offer basic track/song information. The
+message MUST be sent at the start of the stream, together with the setup and codebook headers, the Metadata packet SHOULD be empty.
During a session the metadata associated with the stream may change from that specified at the start, e.g. a live concert
broadcast changing acts/scenes, so clients MUST have the ability to receive Metadata header blocks. Details on the format of the
comments can be found in the Vorbis documentation <xref target="v-comment"></xref>.
@@ -715,8 +710,7 @@
<section anchor="Packed Headers Delivery" title="Packed Headers Delivery">

<t>
-As mentioned above the RECOMMENDED delivery vector for Vorbis configuration data is via an SDP attribute as this retrieval method
-can be performed using a reliable transport protocol.
+As mentioned above the RECOMMENDED delivery vector for Vorbis configuration data is via a retrieval method that can be performed using a reliable transport protocol.
</t>

<figure anchor="Packed Headers Overview Figure" title="Packed Headers Overview">
@@ -734,6 +728,7 @@
</figure>

<t>
+{FIXME use a metadata format used already by other containers?}
As the RTP headers are not required for this method of delivery the
structure of the configuration data is slightly different. The packed header starts with a 32 bit count field which details the number of packed headers that are contained in the bundle. Next is the packed header payload for each chained Vorbis file.
</t>
@@ -855,7 +850,7 @@

<t>
Encoding considerations:</t><t>
-This type is only defined for transfer via HTTP as specified in RFC XXXX.
+This type is only defined for transfer via non RTP protocols as specified in RFC XXXX.
</t>

<t>
@@ -902,6 +897,7 @@
<section anchor="Codebook Caching" title="Codebook Caching">

<t>
+{FIXME}
Codebook caching allows clients that have previously connected to a stream to re-use the associated codebooks and configuration
data. When a client receives a codebook it may store it locally and can compare the CRC32 key with that of the new stream and
begin decoding before it has received any of the headers.
@@ -938,12 +934,12 @@

<t>
Required Parameters:</t><t>
-header indicates the URI of the decoding configuration headers.
+delivery indicates the delivery methods in use
</t>

<t>
Optional Parameters: </t><t>
-None.
+header indicates the URI of the decoding configuration headers.
</t>

<t>
@@ -1010,7 +1006,11 @@
<t>The parameter "channels" also goes in "a=rtpmap" as channel count.</t>
<vspace blankLines="1" />

+<t>The parameter "delivery" goes in the SDP "a=delivery" attribute.</t>
+<vspace blankLines="1" />
+
<t>The parameter "header" goes in the SDP "a=fmpt" attribute.</t>
+
</list>


@@ -1027,6 +1027,7 @@
<t>
The port value is specified by the server application bound to the address specified in the c attribute. The bitrate value
and channels specified in the rtpmap attribute MUST match the Vorbis sample rate value. An example is found below.
+{FIXME add md5sum entry and order in a better way the delivery attribute}
</t>

<vspace blankLines="1" />
@@ -1034,7 +1035,8 @@
<t>c=IN IP4/6 </t>
<t>m=audio RTP/AVP 98</t>
<t>a=rtpmap:98 VORBIS/44100/2</t>
-<t>a=fmtp:98 header=&lt;URL of configuration header&gt; </t>
+<t>a=delivery:out_band/http
+<t>a=fmtp:98 header=http://path/to/the/headers</t>
</list>

<t>
@@ -1045,7 +1047,7 @@
</t>

<t>
-The answer to any offer, <xref target="rfc3264"></xref>, MUST NOT change the URL specified in the header attribute.
+The answer to any offer, <xref target="rfc3264"></xref>, MUST NOT change the URI specified in the header attribute.
</t>

</section>
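
For anyone who wants to experiment with the SDP example from the patch, here is a minimal sketch of how a client might read it. It assumes the attribute names shown in the example (`a=rtpmap`, `a=delivery`, `a=fmtp` with a `header` parameter); the function name `parse_vorbis_sdp` and the returned dictionary layout are hypothetical and not part of the draft, and the `a=delivery` syntax is still under discussion.

```python
from typing import Any, Dict


def parse_vorbis_sdp(sdp: str) -> Dict[str, Any]:
    """Collect the Vorbis-related attributes from an SDP description.

    Hypothetical helper: the dictionary keys are illustrative only.
    """
    info: Dict[str, Any] = {"rate": None, "channels": None,
                            "delivery": [], "header": None}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=rtpmap:"):
            # e.g. "a=rtpmap:98 VORBIS/44100/2"
            _, _, encoding = line.partition(" ")
            parts = encoding.split("/")
            if len(parts) >= 2 and parts[0].upper() == "VORBIS":
                info["rate"] = int(parts[1])
                # The channel count is optional in rtpmap; assume 1 if absent.
                info["channels"] = int(parts[2]) if len(parts) > 2 else 1
        elif line.startswith("a=delivery:"):
            # Several delivery methods may be advertised, so keep a list.
            info["delivery"].append(line[len("a=delivery:"):].strip())
        elif line.startswith("a=fmtp:"):
            # e.g. "a=fmtp:98 header=http://path/to/the/headers"
            _, _, params = line.partition(" ")
            for param in params.split(";"):
                key, _, value = param.strip().partition("=")
                if key == "header":
                    info["header"] = value
    return info
```

Keeping the delivery methods in a list reflects the sentence in the patch that more than one method may be advertised for the same session.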
David Barrett
2005-09-04 19:05:44 UTC
Looks good to me, but can you post a link to the latest full draft?
Luca Barbato
2005-09-04 19:43:32 UTC
Post by David Barrett
Looks good to me, but can you post a link to the latest full draft?
Sorry

http://svn.xiph.org/trunk/vorbis/doc/draft-ietf-avt-vorbis-rtp-00.xml

lu
--
Luca Barbato

Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero
Tor-Einar Jarnbjo
2005-09-04 21:36:12 UTC
Post by Luca Barbato
http://svn.xiph.org/trunk/vorbis/doc/draft-ietf-avt-vorbis-rtp-00.xml
For easier reading, I've moved my txt and html converters to the new SVN
file:

http://www.j-ogg.de/rfc/vorbis-rtp.txt
http://www.j-ogg.de/rfc/vorbis-rtp.html

Tor
David Barrett
2005-09-05 16:22:42 UTC
I think it looks great. A couple of questions:

First, a general one. There seems to be an extremely high degree of
similarity between at least the Theora and Vorbis RTP formats. To what
degree do you believe it's desirable to maintain this similarity? In
the most extreme case, I could imagine a "Baseline Xiph RTP" spec that
has profiles for each Xiph codec. Is this interesting to anyone?
If we use the fragmented Vorbis
packet example above and the first packet is lost the client SHOULD
detect that the next packet has the packet count field set to 0 and
the C bit is set and MUST drop it. The next packet, which is the
final fragmented packet, SHOULD be dropped in the same manner, or
buffered. Feedback reports on lost and dropped packets MUST be sent
back via RTCP.
In the situation where there is a three-packet fragmented payload, and
the first packet is dropped, why MUST the second be dropped as well, but
the third only SHOULD be dropped? I assumed if *any* fragment of a
packet is dropped then *all* must be dropped. Why not?
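
To make the "drop everything if any fragment is lost" reading concrete, here is a rough sketch of the receiver logic I have in mind. It is not taken from the draft: the `Fragment` fields `first` and `last` are hypothetical stand-ins for whatever the C bit and packet count field end up encoding, and loss is detected purely from gaps in the 16-bit RTP sequence number.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Fragment:
    seq: int      # RTP sequence number (16 bit, wraps around)
    first: bool   # hypothetical flag: first fragment of a Vorbis packet
    last: bool    # hypothetical flag: final fragment of a Vorbis packet
    data: bytes


def reassemble(fragments: Iterable[Fragment]) -> List[bytes]:
    """Return only the Vorbis packets for which every fragment arrived."""
    complete: List[bytes] = []
    parts: Optional[List[bytes]] = None   # None: not inside an intact packet
    expected_seq: Optional[int] = None
    for frag in fragments:
        if frag.first:
            parts = [frag.data]           # start a new packet
        elif parts is not None and frag.seq == expected_seq:
            parts.append(frag.data)       # in-order continuation
        else:
            parts = None                  # something was lost: drop it all
        expected_seq = (frag.seq + 1) & 0xFFFF
        if frag.last:
            if parts is not None:
                complete.append(b"".join(parts))
            parts = None
    return complete
```

With this policy the second and third fragments are treated the same way once the first is missing, which is the behaviour I assumed the text intended.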


Section 4 on Configuration Headers
Out of the two delivery vectors the use of an SDP attribute to
indicate an URI where the configuration and codebook data can be
obtained is preferred as they can be fetched reliably using TCP. The
in-band codebook delivery SHOULD only be used in situations where the
link between the client is unidirectional or if the SDP-based
information is not available.
Can we add another case to the last sentence such as:

"The in-band codebook delivery SHOULD only be used in situations where
the link between the client is unidirectional, SDP-based information is
not available, or TCP connections cannot be used."
Synchronizing the configuration and codebook headers to the RTP
stream is critical. The 32 bit Codebook Ident field is used to
indicate when a change in the stream has taken place. The client
application MUST have in advance the correct configuration and
codebook headers and if the client detects a change in the Ident
value and does not have this information it MUST NOT decode the raw
Vorbis data.
First, by "change in the stream" does this imply a "chain boundary"?

Also, to accommodate the periodic retransmission of codebooks, perhaps
change the last sentence to:

"The client application MUST NOT decode the raw Vorbis data for packets
encoded with codebooks the client has not yet obtained."
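
As a sketch of what that wording would mean for a client, assuming the Codebook Ident is a CRC32 of the codebook bytes as the draft currently says: the class and method names below are hypothetical, and `zlib.crc32` is only assumed to be the CRC32 variant the draft intends.

```python
import zlib
from typing import Dict


class CodebookStore:
    """Codebooks the client has obtained so far, keyed by 32-bit Ident."""

    def __init__(self) -> None:
        self._by_ident: Dict[int, bytes] = {}

    def add(self, ident: int, codebook: bytes) -> bool:
        """Store a codebook; reject it if the checksum does not match.

        Receiving the same codebook again (periodic retransmission)
        simply overwrites the identical entry.
        """
        if zlib.crc32(codebook) & 0xFFFFFFFF != ident:
            return False
        self._by_ident[ident] = codebook
        return True

    def can_decode(self, ident: int) -> bool:
        """A payload may be decoded only if its codebook is already here."""
        return ident in self._by_ident
```

A receiver would call `can_decode()` with the Ident of each raw Vorbis payload and skip the payload when it returns False, regardless of whether the Ident just changed.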


Section 4.1 on In-Band Header Transmission
The three header data blocks are sent in-band with the packet type
bits set to match the payload type. Normally the codebook and
configuration headers are sent once per session if the stream is an
encoding of live audio, as typically the encoder state will not
change, but the encoder state can change at the boundary of chained
Vorbis audio files. Metadata can be sent at the start as well as any
time during the life of the session. Clients MUST be capable of
dealing with periodic re-transmission of the configuration headers.
By saying "the encoder state can change at the boundary of chained
files", how precisely is this interpreted? If I know I'm going to
switch to a new chain (i.e., begin broadcasting raw payloads using a new
codebook ID) in 10 seconds, can I send my codebooks over in advance?
Or, need I wait until I finish with one chain before even broadcasting
the codebooks for the next?

In other words, I didn't see "chaining" explicitly defined. Am I
correct to understand that any sequential run of raw payloads using
the same codebook ID comprises a single chain? Can I broadcast the
headers for one chain within the boundaries of another?
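
To illustrate the definition I am proposing (it is my reading, not something the draft states), a chain would be each maximal run of consecutive payloads that carry the same codebook ID. The function name and the `(ident, data)` tuple layout are made up for the example.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def split_into_chains(
    payloads: Iterable[Tuple[int, bytes]]
) -> List[Tuple[int, List[bytes]]]:
    """Group (ident, data) payloads into runs sharing one codebook ID."""
    return [
        (ident, [data for _, data in run])
        for ident, run in groupby(payloads, key=lambda p: p[0])
    ]
```

Note that under this reading a codebook ID that comes back later starts a new chain rather than extending the earlier one.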

Also, by "Metadata can be sent..." does this mean I can change the
metadata even within a single chain?
If the stream comprises chained Vorbis files the configuration and
codebook headers for each file SHOULD be packaged together and passed
to the client using the headers attribute if all the files to be
played are known in advance.
The Vorbis configuration specified in the header attribute MUST
contain all of the configuration data and codebooks needed for the
life of the session.
I read the first paragraph to mean that I SHOULD know all codebooks in
advance (but needn't), whereas the second states that I MUST. Which is it?


Thanks Luca, looks great!

-david
