[xiph-rtp] Codebook transmission

Discussion:

Ralph Giles

2005-08-29 17:59:06 UTC

Aaron and I also talked quite a bit about codebook transmission today.

A quick summary:

Things are simplified by having only one set to transmit.

First, the decoder MUST handle headers sent inline in the RTP stream.

Aaron also suggested that all the info header parameters (sample rate,
frame size, number of channels and so on) be mirrored in the SDP as
well. Sounds good to me.

For the single-forward channel (broadcast) application, they just have
to be baked in, or use some other transmission mechanism outside the
scope of our spec. Without the chain ID, you can't switch among defined
codebooks though, unless you rely on inline transmission.

For IP unicast, we talked about http download. Something over RTSP would
presumably also work. So you'd have:

a:fmtp codebook-url=http://example.com/streamheader.ogg

in the SDP. (and a similar v:fmtp for theora)

Would 'header-url' be a better name?

There are two proposals for the header format itself. One is to use Ogg,
just making available the headers as if it were a normal stream. I don't
think this is onerous. Most players will be linked with libogg anyway to
also handle http streaming, and if not, a non-validating Ogg parser can
return the headers in a page or so of code.

A caveat here: the idea is this is braindead simple if you're
already serving from an Ogg stream, and not hard to construct if
you're not.

However, in the case of a multiplexed Ogg stream, the server may
have to split things out to make the right header set the one the
decoder finds. So we can either always require this, by specifying
the url MUST return a degenerate Ogg stream with the matching
headers and only the matching headers at the start, or we can use
an additional parameter like 'header-index=4' to tell the decoder
to use the fourth logical bitstream it finds. I don't like this
second option on elegance grounds, but being able to use the same
url for both vorbis and theora headers in an A/V stream is an
advantage.
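
To illustrate the "page or so of code" claim, here is a minimal sketch of a
non-validating Ogg header reader in Python. It ignores the page CRC and
header-type flags entirely. The function name and the one-based header_index
argument (mirroring the proposed 'header-index' parameter) are assumptions
made for this example, as is the default of three header packets per stream.

```python
import struct

def read_header_packets(data, header_index=1, count=3):
    """Return the first `count` packets of the header_index-th logical
    bitstream (one-based, in order of first appearance) in an Ogg file.

    Non-validating: page CRCs and header-type flags are not checked.
    """
    if header_index < 1:
        raise ValueError("header_index is one-based")
    order = []     # serial numbers in order of first appearance
    partial = {}   # serial -> bytes of the packet being assembled
    packets = {}   # serial -> list of completed packets
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost page sync at offset %d" % pos)
        (serial,) = struct.unpack_from("<I", data, pos + 14)
        nsegs = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + nsegs]
        body = pos + 27 + nsegs
        if serial not in packets:
            order.append(serial)
            packets[serial] = []
            partial[serial] = b""
        for size in lacing:
            partial[serial] += data[body:body + size]
            body += size
            if size < 255:  # a lacing value below 255 ends the packet
                packets[serial].append(partial[serial])
                partial[serial] = b""
        pos = body
        if len(order) >= header_index:
            wanted = packets[order[header_index - 1]]
            if len(wanted) >= count:
                return wanted[:count]
    raise ValueError("stream ended before the headers were complete")
```

With the "degenerate stream" option the caller would always use the default
header_index=1; with the 'header-index=4' option it would pass 4.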

The other proposal is Luca's with just a sequence of 4-byte length,
packet data pairs. This is much simpler to generate and parse, but
if we do this, we really need to define a MIME type under which it
can be served. application/x-vorbis-headers? video/x-theora-headers?
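
For comparison, a sketch of the length-prefixed format in Python. The
proposal as summarised here does not state a byte order for the 4-byte
length, so big-endian (network order) is an assumption, and the function
names are made up for this example.

```python
import struct

def pack_headers(packets):
    """Concatenate packets, each preceded by a 4-byte length.

    Big-endian lengths are an assumption, not part of the proposal.
    """
    return b"".join(struct.pack(">I", len(p)) + p for p in packets)

def unpack_headers(blob):
    """Split a blob produced by pack_headers() back into packets."""
    packets = []
    pos = 0
    while pos < len(blob):
        if pos + 4 > len(blob):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        if pos + length > len(blob):
            raise ValueError("truncated packet")
        packets.append(blob[pos:pos + length])
        pos += length
    return packets
```

Both directions fit in a few lines, which is the appeal; the cost is that
the result is not self-describing, hence the need for a MIME type.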

We need to decide if we want to support either of these formats or
both.

For multicast, the 'all headers, all the time' stream still works
fine, using the same RTP payload as the data stream. Aaron's suggestion
was just to have the codebook-url point to an SDP for the header
multicast stream.

Including a hash in the SDP to allow players to cache the
actual setup data still seems like a good idea. Perhaps an extra
parameter thus:

v:fmtp codebook-url=http://example.com/stream/rtp8343.header \
;header-md5=5d35bb7210c0a699d28aaf9baf7bb2e0

which the server MAY supply, with a MUST for what it's actually
a hash of.
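
A rough sketch of how a player might use such a hash, in Python. What the
hash covers is still open at this point in the discussion; this example
assumes it is the MD5 of the exact bytes returned by the URL. The class
name and the fetch callback are hypothetical.

```python
import hashlib

class HeaderCache:
    """Cache downloaded header data, keyed by the MD5 from the SDP."""

    def __init__(self):
        self._by_md5 = {}

    def get(self, header_md5, fetch):
        """Return cached data for header_md5, calling fetch() on a miss.

        fetch is any zero-argument callable returning bytes, for
        example a function that downloads the codebook URL.
        """
        key = header_md5.lower()
        if key in self._by_md5:
            return self._by_md5[key]
        data = fetch()
        # Assumption: the hash is taken over the raw downloaded bytes.
        if hashlib.md5(data).hexdigest() != key:
            raise ValueError("downloaded headers do not match header-md5")
        self._by_md5[key] = data
        return data
```

A second stream advertising the same hash then costs no download at all,
which is the point of the parameter.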

Finally, Aaron suggested we let the RTP timestamp override the
fixed fps from the theora header/SDP, making a true variable
bitrate stream. This came up in the context of bitrate scalability
but also addresses the resampling cost when cutting between
film and video, for example.

This has another advantage in that detecting dropped frames from
the sequence number can be difficult with fragmented packets, so
relying on the timestamp makes sense. It then becomes a question
of whether the timestamps have to match the fixed framerate from
the header. If not, however, there's no longer a one-to-one
correspondence with an Ogg stream.

We could make this dependent on the fps being undefined (0 or */0)
in the header.

-r

Aaron Colwell

2005-08-30 14:56:23 UTC

Post by Ralph Giles
Aaron and I also talked quite a bit about codebook transmission today.
Things are simplified by having only one set to transmit.
First, the decoder MUST handle headers sent inline in the RTP stream.
Aaron also suggested that all the info header parameters (sample rate,
frame size, number of channels and so on) be mirrored in the SDP as
well. Sounds good to me.

I just want to clarify this a little. What I'm proposing is to have the
ident header hex encoded in the fmtp field in the SDP. This roughly mirrors
how the ESDS is transmitted for MPEG4.
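
As a sketch of what hex encoding the ident header could look like for
Vorbis, here is a Python round trip. The 30-byte layout follows the Vorbis
identification header; the function names, the default block sizes and the
choice of upper-case hex are assumptions for this example. The Theora ident
header has a different layout and is not covered.

```python
import struct

# Fields after the 7-byte "\x01vorbis" prefix: version, channels,
# sample rate, max/nominal/min bitrate, packed block sizes, framing.
_IDENT_BODY = "<IBIiiiBB"

def encode_ident(channels, sample_rate, nominal_bitrate=0,
                 blocksize_0=8, blocksize_1=11):
    """Build a Vorbis ident header and return it as a hex string.

    blocksize_0 and blocksize_1 are the base-2 exponents stored in
    the header (8 and 11 mean 256 and 2048 samples).
    """
    packet = b"\x01vorbis" + struct.pack(
        _IDENT_BODY, 0, channels, sample_rate,
        0, nominal_bitrate, 0,
        (blocksize_1 << 4) | blocksize_0, 1)
    return packet.hex().upper()

def decode_ident(hex_text):
    """Parse the hex string from the fmtp line back into fields."""
    packet = bytes.fromhex(hex_text)
    if len(packet) != 30 or packet[:7] != b"\x01vorbis":
        raise ValueError("not a Vorbis ident header")
    (_version, channels, rate, _br_max, br_nominal, _br_min,
     sizes, _framing) = struct.unpack_from(_IDENT_BODY, packet, 7)
    return {
        "channels": channels,
        "sample_rate": rate,
        "nominal_bitrate": br_nominal,
        "blocksize_0": 1 << (sizes & 0x0F),
        "blocksize_1": 1 << (sizes >> 4),
    }
```

A real header comes to 60 hex characters, so the 16-character ident values
shown in examples in this thread should be read as placeholders.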

Post by Ralph Giles
For the single-forward channel (broadcast) application, they just have
to be baked in, or use some other transmission mechanism outside the
scope of our spec. Without the chain ID, you can't switch among defined
codebooks though, unless you rely on inline transmission.
For IP unicast, we talked about http download. Something over RTSP would
a:fmtp codebook-url=http://example.com/streamheader.ogg

This doesn't quite follow SDP syntax. Here is an example of what I was
suggesting.

a=fmtp:96 ident=4235AB45FAD4ABD6;codebook-url="http://example.com/streamheader.ogg"

The 96 is the payload type used in the a=rtpmap and m= lines. I put quotes
around the url because ';' is a delimiter for the fmtp line, but is also a
valid URL character.
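
To show why the quoting matters to a receiver, here is a sketch of an fmtp
parser in Python that only splits on ';' outside double quotes. The
function name is hypothetical, and it assumes quoted values never contain
an escaped quote character.

```python
def parse_fmtp(line):
    """Split an a=fmtp line into (payload_type, {name: value}).

    A ';' inside double quotes is kept as part of the value; the
    quotes themselves are dropped.
    """
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    fields = []
    current = []
    in_quotes = False
    for ch in rest:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ";" and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    params = {}
    for field in fields:
        name, _, value = field.strip().partition("=")
        if name:
            params[name.strip()] = value.strip()
    return int(payload_type), params
```

A naive split on ';' would cut a URL such as "http://example.com/a;b=c" in
two and produce a bogus parameter named 'b'.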

Post by Ralph Giles
in the SDP. (and a similar v:fmtp for theora)

I think I confused Ralph on IRC with my SDP example. The Theora line should
have the same format as the Vorbis one. There isn't a need to differentiate the
two.

Post by Ralph Giles
Would 'header-url' be a better name?

If the URL only points to the codebook then I think codebook-url is good. If
we are also going to have the ident and comment headers then I think header-url
is probably better.

Post by Ralph Giles
Including a hash in the SDP to allow players to cache the
actual setup data still seems like a good idea. Perhaps an extra
v:fmtp codebook-url=http://example.com/stream/rtp8343.header \
;header-md5=5d35bb7210c0a699d28aaf9baf7bb2e0

Here is an example in the proper fmtp form.

a=fmtp:96 ident=4235AB45FAD4ABD6;codebook-url="http://example.com/stream/rtp8343.header";header-md5=5d35bb7210c0a699d28aaf9baf7bb2e0

Post by Ralph Giles
which the server MAY supply, with a MUST for what it's actually
a hash of.

If the ident is sent in the SDP I think we could say that the MD5 is always for
the codebook only.

Post by Ralph Giles
Finally, Aaron suggested we let the RTP timestamp override the
fixed fps from the theora header/SDP, making a true variable
bitrate stream. This came up in the context of bitrate scalability
but also addresses the resampling cost when cutting between
film and video, for example.
This has another advantage in that detecting dropped frames from
the sequence number can be difficult with fragmented packets, so
relying on the timestamp makes sense. It then becomes a question
of whether the timestamps have to match the fixed framerate from
the header. If not, however, there's no longer a one-to-one
correspondence with an Ogg stream.

You could still generate a single Ogg stream. It would just have a much higher
frame rate and you'd have to add a bunch of extra "empty" frames to fill in
the gaps. :)

Post by Ralph Giles
We could make this dependent on the fps being undefined (0 or */0)
in the header.

The file still needs to have some frame rate information otherwise the server
doesn't know how to timestamp the data or set the RTP session sample rate. On
the client side it doesn't need to look at the frame rate field in the ident
header. It can just look at the sample rate of the RTP session. If it were
writing data out to a file, it could just change the frame rate in the ident
header to 1 / (RTP sample rate) and then generate "empty frames" for the gaps.
Obviously you might want to add some smarts to figure out what the frame rate
is so that you don't waste a bunch of space on empty frames.
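
One possible version of those "smarts", sketched in Python: take the
greatest common divisor of the timestamp deltas as the frame duration, and
count how many empty frames each gap then needs. The function name is made
up, the 90000 Hz clock in the usage note is an assumption, and 32-bit
timestamp wraparound is ignored.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def frame_plan(timestamps, clock_rate):
    """Guess a frame rate from RTP timestamps of consecutive frames.

    Returns (fps, empties) where empties[i] is the number of empty
    frames to write between frame i and frame i + 1.
    """
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not deltas or any(d <= 0 for d in deltas):
        raise ValueError("need at least two increasing timestamps")
    # The largest tick that divides every gap is the longest frame
    # duration that still places every frame exactly.
    tick = reduce(gcd, deltas)
    empties = [d // tick - 1 for d in deltas]
    return Fraction(clock_rate, tick), empties
```

For timestamps 0, 3000, 6000, 12000 on a 90000 Hz clock this gives 30 fps
and a single empty frame in the last gap. Using one clock tick per frame
instead would mean 2999 empty frames in each of the first two gaps.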

Aaron

Post by Ralph Giles
-r
_______________________________________________
xiph-rtp mailing list
http://lists.xiph.org/mailman/listinfo/xiph-rtp

Ralph Giles

2005-08-30 15:44:50 UTC

Post by Aaron Colwell

Post by Ralph Giles
Aaron and I also talked quite a bit about codebook transmission today.

Thanks for the corrections. Sorry I mangled the SDP stuff.

Post by Aaron Colwell
I just want to clarify this a little. What I'm proposing is to have the
ident header hex encoded in the fmtp field in the SDP. This roughly mirrors
how the ESDS is transmitted for MPEG4.

Ah, and *not* broken out? I assumed there were standard keys for things
like frame size and sample rate that generic applications would want to
see.

Post by Aaron Colwell

This doesn't quite follow SDP syntax. Here is an example of what I was
suggesting.
a=fmtp:96 ident=4235AB45FAD4ABD6;codebook-url="http://example.com/streamheader.ogg"
The 96 is the payload type used in the a=rtpmap and m= lines. I put quotes
around the url because ';' is a delimiter for the fmtp line, but is also a
valid URL character.

Post by Ralph Giles
in the SDP. (and a similar v:fmtp for theora)

I think I confused Ralph on IRC with my SDP example. The Theora line should
have the same format as the Vorbis one. There isn't a need to differentiate the
two.

Post by Ralph Giles
Would 'header-url' be a better name?

If the URL only points to the codebook then I think codebook-url is good. If
we are also going to have the ident and comment headers then I think header-url
is probably better.

Here is an example in the proper fmtp form.
a=fmtp:96 ident=4235AB45FAD4ABD6;codebook-url="http://example.com/stream/rtp8343.header";header-md5=5d35bb7210c0a699d28aaf9baf7bb2e0

Post by Ralph Giles
which the server MAY supply, with a MUST for what it's actually
a hash of.

If the ident is sent in the SDP I think we could say that the MD5 is always for
the codebook only.

You could still generate a single Ogg stream. It would just have a much higher
frame rate and you'd have to add a bunch of extra "empty" frames to fill in
the gaps. :)

Post by Ralph Giles
We could make this dependent on the fps being undefined (0 or */0)
in the header.

I was suggesting that if we intend the encoder to be able to optionally
mess with the frame rate, it should signal that it will do so by setting
the fps in the ident header to 0 or undefined in the RTP stream. A
source Ogg Theora stream would of course still have to have a fixed
frame rate.

We can also make it not optional. I was just worrying about the decoder
knowing whether it should be validating/rounding timestamps or not.

I'd also point out that timestamps still occur only once per RTP packet,
which with packing may span multiple theora packets. Given the relative
sizes of internet MTU and theora packets it's not a huge problem in
practice to just be one-to-one, but we should define what happens if
somebody doesn't do that. Is the decoder supposed to interpolate the
display times?
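
If the answer turns out to be "yes, interpolate", one way it could work is
sketched below in Python: spread the frames of each RTP packet evenly up to
the next packet's timestamp. This is only an illustration of the question,
not a proposal for the spec. The function name and the fallback_step used
for the last packet (which has no following timestamp) are assumptions.

```python
from fractions import Fraction

def display_times(packets, fallback_step):
    """Assign a display time, in RTP clock ticks, to every frame.

    packets is a list of (rtp_timestamp, frame_count) pairs in send
    order, with frame_count >= 1. Frames inside a packet are spaced
    evenly between that packet's timestamp and the next one's. The
    last packet uses fallback_step, e.g. the nominal frame duration.
    """
    times = []
    for i, (timestamp, count) in enumerate(packets):
        if i + 1 < len(packets):
            step = Fraction(packets[i + 1][0] - timestamp, count)
        else:
            step = Fraction(fallback_step)
        times.extend(timestamp + k * step for k in range(count))
    return times
```

Note that this needs the next packet before it can time the current one,
which adds a packet of latency; that may be an argument for requiring
one-to-one packing instead.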

-r