[xiph-rtp] Interesting post on AVT list

Discussion:

Aaron Colwell

2005-09-06 14:32:19 UTC

The following link points to a post by the AVT chair.

http://www1.ietf.org/mail-archive/web/avt/current/msg05831.html

If the 1k limit is only for SAP, then perhaps we should revisit the idea of
putting the codebooks in the SDP. He explicitly calls out RTSP and SIP which
are the 2 main control protocols we are talking about here. Multicast could
still just use the inline transmission mechanism.

This solves the reliable delivery problem since the SDP must be reliably
transmitted from the server to the client. This would require that all
codebooks be known at SDP generation time. I'm pretty sure we've already agreed
to that constraint though.
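
As a rough illustration of the idea above, here is a minimal sketch of carrying a codebook in the SDP as base64 text. The `configuration` parameter name, the payload type 96, and both function names are assumptions made for the example; nothing in this thread fixes the attribute syntax.

```python
import base64


def codebook_to_fmtp(payload_type: int, codebook: bytes) -> str:
    """Build a hypothetical a=fmtp line carrying a base64-encoded codebook."""
    encoded = base64.b64encode(codebook).decode("ascii")
    return f"a=fmtp:{payload_type} configuration={encoded}"


def fmtp_to_codebook(line: str) -> bytes:
    """Recover the raw codebook bytes from a line built by codebook_to_fmtp."""
    prefix, _, params = line.partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an fmtp attribute line")
    # Split at the first '=' only, so base64 padding stays in the value.
    key, _, value = params.partition("=")
    if key != "configuration":
        raise ValueError("no configuration parameter found")
    return base64.b64decode(value)
```

Because the whole codebook rides inside the session description, the client either has it before the first RTP packet arrives or has no session at all.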

Comments?

Aaron

Luca Barbato

2005-09-06 16:16:41 UTC

Post by Aaron Colwell
If the 1k limit is only for SAP, then perhaps we should revisit the idea of
putting the codebooks in the SDP. He explicitly calls out RTSP and SIP which
are the 2 main control protocols we are talking about here. Multicast could
still just use the inline transmission mechanism.

Only for the cases in which the codebooks aren't known before the stream
starts. If you pass all the codebooks in the SDP, every client will know
them at the session start.

Post by Aaron Colwell
This solves the reliable delivery problem since the SDP must be reliably
transmitted from the server to the client. This would require that all
codebooks be known at SDP generation time. I'm pretty sure we've already agreed
to that constraint though.

We can even relax that constraint by allowing out-of-band and/or in-band
retransmissions; I'm not sure it is worth the complexity at this point, since
most of the current target implementations are covered, I hope.

Post by Aaron Colwell
Comments?

I suggested that approach long ago, but then somebody pointed me to the 1k
constraint on SDP; if that doesn't apply to us I'm quite happy.

lu

PS: I'll try to update the RFC to make that vector the default and
suggested one, and I'd leave the others as discouraged.

--
Luca Barbato

Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero

David Barrett

2005-09-06 17:23:25 UTC

Post by Luca Barbato

Post by Aaron Colwell
This would require that all
codebooks be known at SDP generation time. I'm pretty sure we've
already agreed to that constraint though.

Basically someone needs to decide:

1) Is "chaining" supported in RTP?
2) If it is, can new codebooks be delivered on the fly?
3) Does this decision apply only to Vorbis, or also to Theora, etc.?

My preference would be "yes" to all three.

Most of the complexity is on the server side -- ensuring codebooks are
somehow delivered before the client needs them -- and it's the
server side that chooses whether it wants to chain. In other words, the price
is paid by the same developer who makes the decision; all the client
needs to do is maintain a codebook cache and reset the decoder -- not
trivial, but not difficult tasks.

Personally, I think the ability to change the parameters mid-stream
without the overhead of a total stream negotiation is a nice advantage
of the Xiph codecs. This in fact could be a selling point for streaming
situations where high-level adaptive encoding is valuable.

There's also perhaps a fourth question:

4) Will Xiph codecs ever have "fixed" codebooks?

However, it's my impression this has already been decided (and the
answer is "no"), and we're simply grappling with the consequences.

Post by Luca Barbato

Post by Aaron Colwell
Comments?

I suggested that approach long ago, but then somebody pointed me to the 1k
constraint on SDP; if that doesn't apply to us I'm quite happy.

While I love the idea of using SDP for codebook delivery, I've never
heard of jamming huge base64-encoded binary chunks (i.e., 5-10 KB) into it,
and I'm not sure that's the intent of its designers. For example, are
there parser limits on field sizes that this would break?

Furthermore, the only way SIP (not sure about RTSP) can deliver >MTU
packets is to use TCP. If we were to mandate this we'd in effect state
that Vorbis RTP streams can only be set up in environments where TCP is

Post by Luca Barbato
18.1.1 Sending Requests
...
If a request is within 200 bytes of the path MTU, or if it is larger
than 1300 bytes and the path MTU is unknown, the request MUST be sent
using an RFC 2914 [43] congestion controlled transport protocol, such
as TCP.

(Granted, just above this quote SIP states that TCP "MUST" be supported
for precisely this reason -- to deliver >MTU packets -- but it's my
impression that this is often ignored in the increasingly-common P2P case.)
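
The quoted rule can be read as a small decision function. This is a sketch of one reading of the text: the function name is invented for the example, and treating "within 200 bytes" as inclusive of exactly 200 bytes is an assumption.

```python
from typing import Optional


def sip_request_needs_tcp(request_size: int, path_mtu: Optional[int]) -> bool:
    """Return True if the quoted rule requires a congestion-controlled transport.

    request_size and path_mtu are in bytes; path_mtu is None when unknown.
    """
    if path_mtu is None:
        # Unknown path MTU: the threshold in the quoted text is 1300 bytes.
        return request_size > 1300
    # Known path MTU: anything within 200 bytes of it must not go over UDP.
    return request_size >= path_mtu - 200
```

Under this reading, an SDP body holding a base64-encoded codebook of several kilobytes would push a request past either threshold.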

Post by Luca Barbato
PS: I'll try to update the RFC to make that vector the default and
suggested one, and I'd leave the others as discouraged.

I'm fine with stating that vorbis-rtp clients MUST support codebooks via
SDP. However, I'd recommend that vorbis-rtp servers only SHOULD deliver
their codebooks in SDP. This allows for servers to work in pure-UDP
environments without being non-compliant. (They might be non-compliant
with SIP, but that's a separate matter.)

Furthermore, we could state that the inline delivery method SHOULD NOT
be used by the server unless the SDP method is unsuitable. Regardless,
just as clients MUST support the SDP method (even if it's not always
suitable), clients MUST support inline delivery.

And finally, this doesn't prevent any optional codebook delivery
profiles (such as my favorite, inline with acknowledgment) that work on
top of this.
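
The proposed requirement levels could be sketched as follows. The test for when SDP delivery is "unsuitable" is purely an assumption for illustration (no TCP control channel and an SDP too large for one datagram), as are the names and the 1500-byte default MTU.

```python
from enum import Enum


class Delivery(Enum):
    SDP = "sdp"
    INLINE = "inline"


# Under the proposal above, clients MUST support both delivery methods.
CLIENT_MUST_SUPPORT = frozenset({Delivery.SDP, Delivery.INLINE})


def choose_delivery(sdp_size: int, tcp_available: bool,
                    path_mtu: int = 1500) -> Delivery:
    """Server-side choice: SDP SHOULD be used, inline only as a fallback.

    Assumption: SDP is "unsuitable" only when there is no TCP control
    channel and the SDP would not fit in a single datagram with a
    200-byte safety margin.
    """
    fits_in_datagram = sdp_size < path_mtu - 200
    if tcp_available or fits_in_datagram:
        return Delivery.SDP
    return Delivery.INLINE
```

A pure-UDP server with an 8000-byte SDP would fall back to inline delivery here, while any server with a TCP control channel would stay on SDP.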

-david

Aaron Colwell

2005-09-07 15:45:35 UTC

Post by David Barrett

Post by Luca Barbato

Post by Aaron Colwell
This would require that all
codebooks be known at SDP generation time. I'm pretty sure we've
already agreed to that constraint though.

1) Is "chaining" supported in RTP?

Can we please use a different term than "chaining"? Chaining refers to an
Ogg concept that can't be fully supported inside an RTP session. Perhaps
"codebook switching" would be a better term.

My understanding is that we aren't allowing a codec change or a non-integer
sample rate change inside an RTP session. Both of those things are allowed in
a chained file.

Post by David Barrett
2) If it is, can new codebooks be delivered on the fly?

Does this mean a new codebook that wasn't known about at SDP generation time?
If so, then this potentially causes a problem where a codebook shows up that
the client doesn't have the resources to support. The server may not be in a
position to determine whether the client can support the new codebook or not.
I think the codebooks that are available for switching should be negotiated up
front.

Post by David Barrett
My preference would be "yes" to all three.
Most of the complexity is on the server side -- ensuring codebooks are
somehow delivered before the client needs them -- and it's the
server side that chooses whether it wants to chain. In other words, the price
is paid by the same developer who makes the decision; all the client
needs to do is maintain a codebook cache and reset the decoder -- not
trivial, but not difficult tasks.

The client also has to manage its own resources and only try to play something
it knows it can play. The server is in no position to make this decision.

Post by David Barrett
Personally, I think the ability to change the parameters mid-stream
without the overhead of a total stream negotiation is a nice advantage
of the Xiph codecs. This in fact could be a selling point for streaming
situations where high-level adaptive encoding is valuable.

Many video codecs (H.263+, MPEG4, RV, WMV) at least have this ability without
the need to negotiate different codebooks. It has yet to be proven that there
are significant gains to be had from different codebook selections. That is
a separate discussion though. I'm only pointing out that this functionality has
been supported without the need to reliably transmit large chunks of codebook
data.

Post by David Barrett

Post by Luca Barbato
I suggested that approach long ago, but then somebody pointed me to the 1k
constraint on SDP; if that doesn't apply to us I'm quite happy.

Well, part of the reason why I brought this up in the first place was because
the chair of the AVT list suggested it. Technically SDP is the responsibility
of MMUSIC, so I guess we should ask them what they think. To my knowledge the
SDP spec never says anything about size limits, so if a parser fails, then it
is broken.

Post by David Barrett
Furthermore, the only way SIP (not sure about RTSP) can deliver >MTU
packets is to use TCP. If we were to mandate this we'd in effect state
that Vorbis RTP streams can only be set up in environments where TCP is

It seems strange to me that implementations ignore a MUST inside a standard.
I don't think we should bow to SIP implementations that aren't RFC compliant.
With that said though, I agree that we should probably still have a solution
for UDP.

Post by David Barrett

Post by Luca Barbato
PS: I'll try to update the RFC to make that vector the default and
suggested one, and I'd leave the others as discouraged.

Sounds reasonable.

Post by David Barrett
Furthermore, we could state that the inline delivery method SHOULD NOT
be used by the server unless the SDP method is unsuitable. Regardless,
just as clients MUST support the SDP method (even if it's not always
suitable), clients MUST support inline delivery.

Sounds reasonable.

Post by David Barrett
And finally, this doesn't prevent any optional codebook delivery
profiles (such as my favorite, inline with acknowledgment) that work on
top of this.

Sounds fine to me.

David Barrett

2005-09-07 18:35:26 UTC

Post by Aaron Colwell
Can we please use a different term than "chaining"? Chaining refers to an
Ogg concept that can't be fully supported inside an RTP session. Perhaps
"codebook switching" would be a better term.

Good idea; "codebook switching" is much more accurate and intuitive.

Post by Aaron Colwell

Post by David Barrett
2) If it is, can new codebooks be delivered on the fly?

I certainly agree this is a possibility. I'd say:

Servers SHOULD define all codebooks at SDP generation time. But servers
MAY send new codebooks at any time, and clients SHOULD support all
codebooks if possible. Packets encoded in a codebook the client cannot
support MUST be dropped.

Thus servers seeking maximum compatibility across the broadest range of
devices would pre-negotiate all codebooks up front. But servers that
know their clients can handle the full range of codebooks should have
the option of sending them "just in time".
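
On the client side, the rules above amount to a codebook cache plus a drop policy. Here is a minimal sketch, assuming codebooks are keyed by an integer ID and that "cannot support" simply means exceeding a byte budget; the class, the function, and the budget are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CodebookCache:
    max_bytes: int  # hypothetical resource budget chosen by the client
    books: Dict[int, bytes] = field(default_factory=dict)

    def used(self) -> int:
        return sum(len(data) for data in self.books.values())

    def offer(self, ident: int, data: bytes) -> bool:
        """Store a codebook; return False if the client cannot support it."""
        extra = len(data) - len(self.books.get(ident, b""))
        if self.used() + extra > self.max_bytes:
            return False
        self.books[ident] = data
        return True


def handle_packet(cache: CodebookCache, ident: int,
                  payload: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return (codebook, payload) for decoding, or None to drop the packet."""
    book = cache.books.get(ident)
    if book is None:
        # Unknown or unsupported codebook: the packet MUST be dropped.
        return None
    return (book, payload)
```

Codebooks announced in the SDP and codebooks sent "just in time" would both go through `offer`, so the client makes the resource decision in one place.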

To give a real-world example, I do live videoconferencing using Theora,
and I'd like to adjust the frame size depending on the size of the
playback window (i.e., if it's full screen, send a big feed, if it's a
thumbnail, send a small one, and anywhere in between). Thus rather than
pre-generating every possible codebook and sending them over up front, I'd
like to generate codebooks "just in time" and send as needed.

Post by Aaron Colwell
The client also has to manage its own resources and only try to play something
it knows it can play. The server is in no position to make this decision.

True. It's not without cost for the client.

Post by Aaron Colwell

I certainly agree (and I'd like to hear the final logic on this, if
anyone knows it) that codebooks are a *big immediate cost* for only a
potential future gain. Has anyone done a rigorous cost/benefit analysis
demonstrating the value of dynamic codebooks?

Regardless, I'm accepting for the sake of progress that dynamic
codebooks are the way it is.

Post by Aaron Colwell
It seems strange to me that implementations ignore a MUST inside a standard.
I don't think we should bow to SIP implementations that aren't RFC compliant.
With that said though, I agree that we should probably still have a solution
for UDP.

I'll ask around the SIP/P2P community and see if my initial impression
is right. As for the SIP/TCP requirement, that was actually added quite
late (it was originally optional) so I'm not sure to what degree very
old or very new implementations support it.

-david

Aaron Colwell

2005-09-07 22:40:58 UTC

Post by David Barrett

Post by Aaron Colwell

Post by David Barrett
2) If it is, can new codebooks be delivered on the fly?

Servers SHOULD define all codebooks at SDP generation time. But servers
MAY send new codebooks at any time, and clients SHOULD support all
codebooks if possible. Packets encoded in a codebook the client cannot
support MUST be dropped.
Thus servers seeking maximum compatibility across the broadest range of
devices would pre-negotiate all codebooks up front. But servers that
know their clients can handle the full range of codebooks should have
the option of sending them "just in time".

This sounds reasonable to me.

Post by David Barrett
To give a real-world example, I do live videoconferencing using Theora,
and I'd like to adjust the frame size depending on the size of the
playback window (i.e., if it's full screen, send a big feed, if it's a
thumbnail, send a small one, and anywhere in between). Thus rather than
pre-generating every possible codebook and sending them over up front, I'd
like to generate codebooks "just in time" and send as needed.

This actually brings up another "issue". In the case you described above
you technically only need to update the ident header and could use the same
codebook. Do we want to provide a way of indicating that some headers for a
new "codebook ID" are the same as those of another "codebook ID"? The way we have things
now the answer seems to be no. I'm only asking because you could potentially
avoid sending the codebook over and over again if you decide to keep changing
the frame size.
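
One way to picture the question is a sender that remembers which headers it has already delivered and skips them for a new configuration. This is only a sketch of the idea, not something the current draft does: the class name is made up, and identifying headers by a SHA-1 digest is an assumption.

```python
import hashlib
from typing import List, Set


class HeaderSender:
    """Tracks delivered headers so identical ones are not sent twice."""

    def __init__(self) -> None:
        self._sent: Set[str] = set()

    def headers_to_send(self, ident_header: bytes,
                        setup_header: bytes) -> List[bytes]:
        """Return only the headers the receiver has not been given yet."""
        pending = []
        for header in (ident_header, setup_header):
            digest = hashlib.sha1(header).hexdigest()
            if digest not in self._sent:
                self._sent.add(digest)
                pending.append(header)
        return pending
```

With this approach a frame-size change would resend only the small ident header, at the cost of splitting up headers that the codec otherwise treats as a unit.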

Post by David Barrett
I certainly agree (and I'd like to hear the final logic on this, if
anyone knows it) that codebooks are a *big immediate cost* for only a
potential future gain. Has anyone done a rigorous cost/benefit analysis
demonstrating the value of dynamic codebooks?

I agree. I'd be interested in seeing hard stats on what changes have been made
to the Vorbis codebooks since it was initially released and how much gain
has been realized.

Post by David Barrett
Regardless, I'm accepting for the sake of progress that dynamic
codebooks are the way it is.

Me too.

Aaron

Ralph Giles

2005-09-07 23:35:37 UTC

Post by Aaron Colwell
This actually brings up another "issue". In the case you described above
you technically only need to update the ident header and could use the same
codebook.

That's what's nice about the way the codec design treats the three
headers as a unit, as in Ogg chaining. Splitting them up to allow
independent switching, or because one or two are much smaller than
the others, has a design cost.

Post by Aaron Colwell
I agree. I'd be interested in seeing hard stats on what changes have been made
to the Vorbis codebooks since it was initially released and how much gain
has been realized.

The general figure quoted is a factor of 2 in bitrate at the same
quality.

-r

Tor-Einar Jarnbjo

2005-09-06 18:24:38 UTC

Yes, Colin was right that the 1k size limit for the SDP descriptor is
not related to RTSP or SIP. However, SDP descriptors are usually _not_
reliably transmitted. Embedding the codebook header in the SDP means
that the encapsulating UDP packet will exceed the path MTU for all
common Vorbis streams and/or network situations and you will also impose
a length restriction on the codebook header to somewhere around 65000
bytes. For Vorbis this might not be a great problem. What about Theora?
So still, my objections against unreliable codebook delivery are valid.
If we put the codebook into the SDP, we will not even have a possibility
to transmit the codebook fragments twice, just to assume that the client
will have received at least one of each fragment.
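
The size concern can be made concrete with a little arithmetic. A sketch, assuming IPv4 without options (20-byte IP header, 8-byte UDP header), a 1500-byte Ethernet-style MTU, and standard base64 with padding; the names are made up for the example, and the rest of the SDP and the RTSP/SIP headers are ignored.

```python
TYPICAL_MTU = 1500               # assumption: common Ethernet MTU
MAX_UDP_PAYLOAD = 65535 - 20 - 8  # largest UDP payload over IPv4


def base64_len(raw_len: int) -> int:
    """Length of the padded base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


def max_raw_codebook(other_bytes: int = 0) -> int:
    """Largest raw codebook whose base64 form fits in a single datagram.

    other_bytes is whatever else shares the datagram (SDP lines, headers).
    """
    return (MAX_UDP_PAYLOAD - other_bytes) // 4 * 3
```

Even a 5 KB codebook encodes to well over a typical MTU, so the datagram would have to be fragmented, and the ceiling on the raw codebook is noticeably below the datagram limit once base64 expansion is counted.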

Tor

Luca Barbato

2005-09-06 18:53:40 UTC

Post by Tor-Einar Jarnbjo
Yes, Colin was right that the 1k size limit for the SDP descriptor is
not related to RTSP or SIP. However, SDP descriptors are usually _not_
reliably transmitted. Embedding the codebook header in the SDP means
that the encapsulating UDP packet will exceed the path MTU for all
common Vorbis streams and/or network situations and you will also impose
a length restriction on the codebook header to somewhere around 65000
bytes. For Vorbis this might not be a great problem. What about Theora?
So still, my objections against unreliable codebook delivery are valid.
If we put the codebook into the SDP, we will not even have a possibility
to transmit the codebook fragments twice, just to assume that the client
will have received at least one of each fragment.

Lose the session description and you can't start the session.

Which method would deliver the SDP unreliably?

lu

Tor-Einar Jarnbjo

2005-09-06 19:29:31 UTC

Post by Luca Barbato
Lose the session description and you can't start the session.

Exactly, and no media server would appreciate each client trying to
set up a session several times because the server is not able to deliver a
response to the client's RTSP or SIP request.

Post by Luca Barbato
Which method would deliver the SDP unreliably?

RTSP and SIP, isn't that what we're talking about? Both methods usually
use UDP transport.

Tor

Aaron Colwell

2005-09-06 19:38:26 UTC

Post by Tor-Einar Jarnbjo

Post by Luca Barbato
Lose the session description and you can't start the session.

Exactly, and no media server would appreciate each client trying to
set up a session several times because the server is not able to deliver a
response to the client's RTSP or SIP request.

The server wouldn't try to set up the session several times. The server is able
to recognize when the receiver is resending a previous request. It would need
to support this functionality anyway if it decided to allow RTSP or SIP
requests over UDP.

Post by Tor-Einar Jarnbjo

Post by Luca Barbato
Which method would deliver the SDP unreliably?

RTSP and SIP, isn't that what we're talking about? Both methods usually
use UDP transport.

RTSP does NOT usually use UDP. All servers that I am aware of use TCP. Even
if UDP were used for some strange reason, the client can issue the request again
if it doesn't receive the response.

I'm just curious. Have you actually ever implemented anything that uses either
of these protocols? Your claims lead me to believe that you haven't.

Aaron

Tor-Einar Jarnbjo

2005-09-06 19:56:12 UTC

Post by Aaron Colwell
I'm just curious. Have you actually ever implemented anything that uses either
of these protocols? Your claims lead me to believe that you haven't.

Yes, I have. I have never seen SIP used over anything but
UDP, and the few RTSP implementations I've dealt with have
supported both UDP and TCP.

Tor

David Barrett

2005-09-06 20:40:55 UTC

Post by Tor-Einar Jarnbjo

Post by Luca Barbato
Which method would deliver the SDP unreliably?

RTSP and SIP, isn't that what we're talking about? Both methods usually
use UDP transport.

Well, TCP is only reliable because it layers retransmissions atop IP.
SIP is likewise reliable because it layers retransmissions atop UDP.

So SDP is delivered reliably by SIP, because SIP has reliability built
into it.

Regardless, sending any SIP packet using UDP bigger than the MTU will
probably still fail despite SIP's reliability measures because SIP has
no fragmentation ability (that I know of -- can anyone correct me
here?), and fragmented UDP is sometimes blocked by
NATs/routers/firewalls. That's why SIP states anything over the MTU
should be delivered using TCP.
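
To illustrate the retransmission layering described above, here is a sketch of the INVITE retransmission schedule over UDP as RFC 3261 describes it for unreliable transports: the first retransmission after T1 (500 ms by default), the interval doubling each time, and the transaction giving up at 64*T1. The function name is invented for the example.

```python
from typing import List


def invite_retransmit_times(t1: float = 0.5) -> List[float]:
    """Seconds after the first send at which an unanswered INVITE is resent."""
    times = []
    interval = t1        # first retransmission fires after T1
    elapsed = 0.0
    timeout = 64 * t1    # the transaction times out at this point
    while elapsed + interval < timeout:
        elapsed += interval
        times.append(elapsed)
        interval *= 2    # the interval doubles after every retransmission
    return times
```

Each retransmission resends the whole request, so a request that is lost because its fragments are filtered would be lost every time, which is the failure mode described above.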

-david