Discussion:
[xiph-rtp] header ident
Ralph Giles
2005-04-06 01:28:12 UTC
Hey, everyone,

I'd like to get things going again with the RTP spec. I remember us
being stalled on derf's objection that the CRC32 mapping between the
Payload Header's codec setup field and the two (or three) setup packets
themselves had too high a risk of collision.
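For a rough sense of the collision risk being discussed, here is a back-of-envelope sketch. It assumes CRC32 values behave like uniformly random 32 bit numbers (a simplification) and uses the standard birthday approximation. The function name is made up for illustration and nothing here comes from the draft.

```python
import math


def collision_probability(num_setups: int, ident_bits: int = 32) -> float:
    """Birthday-bound estimate that at least two of `num_setups` distinct
    codec setups end up with the same ident, assuming idents are spread
    uniformly over 2**ident_bits values (an assumption, not a CRC32 proof)."""
    space = 2 ** ident_bits
    pairs = num_setups * (num_setups - 1) / 2
    return 1.0 - math.exp(-pairs / space)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, collision_probability(n))
```

The estimate grows roughly with the square of the number of distinct setups a cache ever sees, so the risk depends heavily on how widely idents are shared.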

Seems to me we have the following options:

We cannot make the ident field larger because of overhead. We must
therefore break the general mapping between the ident field and the
decoder setup. So in general it is the problem of the stream description
protocol to establish this mapping.

Now, we can retain the 32 bit ident, so that applications that find the
CRC "good enough" can still use it.

If we do not do that, and it is a general feature of xiph designs to
avoid optional features, then I would argue 32 bits is too much.
Personally, I think 8 bits is plenty, since you have to establish the
mapping in the session description anyway, and 256 chain segment
variations should be enough for anybody. :) But I don't think chaining
is that important to support. We could be a committee and compromise on
16 bits.

So, 8, 16, or 32? That's one question. I notice that 24 would let
us word-align the packet data, which has advantages for embedded
implementations. And another general feature of xiph designs is
excessive room for future expansion. Therefore I'd propose the
following payload header:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Setup Ident          |   Reserved    |C|F|R| # pkts. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

What do you think? I don't have good intuition on how a one byte
overhead of dubious future usefulness weighs against simplicity
of packet construction.
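As a minimal sketch of how simple the fixed-length header is to build and parse, the following packs and unpacks the proposed 32 bit payload header. The field widths are an assumption inferred from the thread (16 bit setup ident, 8 bit reserved field, one bit each for C, F and R, and 5 bits for the packet count, which together fill one 32 bit word). The function names are hypothetical, and the flags are treated as opaque bits since their meaning is defined in the draft, not here.

```python
import struct


def pack_payload_header(ident: int, c: int = 0, f: int = 0, r: int = 0,
                        num_pkts: int = 0, reserved: int = 0) -> bytes:
    """Build the 4-byte payload header in network byte order.

    Assumed layout: 16-bit ident | 8-bit reserved | C | F | R | 5-bit # pkts.
    """
    if not 0 <= ident <= 0xFFFF:
        raise ValueError("ident must fit in 16 bits")
    if not 0 <= reserved <= 0xFF:
        raise ValueError("reserved must fit in 8 bits")
    if not 0 <= num_pkts <= 0x1F:
        raise ValueError("packet count must fit in 5 bits")
    flags = ((c & 1) << 7) | ((f & 1) << 6) | ((r & 1) << 5) | num_pkts
    return struct.pack("!HBB", ident, reserved, flags)


def unpack_payload_header(data: bytes) -> dict:
    """Parse the first 4 bytes of an RTP payload into header fields."""
    ident, reserved, flags = struct.unpack("!HBB", data[:4])
    return {
        "ident": ident,
        "reserved": reserved,
        "c": (flags >> 7) & 1,
        "f": (flags >> 6) & 1,
        "r": (flags >> 5) & 1,
        "num_pkts": flags & 0x1F,
    }
```

Because the header is exactly one 32 bit word, the codec data that follows starts on a word boundary whenever the payload itself does.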

-r
Ralph Giles
2005-05-01 14:52:15 UTC
Post by Ralph Giles
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Setup Ident          |   Reserved    |C|F|R| # pkts. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Discussion on this point seems to have petered out. I'm going to decide
executively in favor of this proposal rather than Aaron's, with the
blessing of Fluendo. Phil, please (again) update the draft to reflect
this new payload header, and Aaron's changes to the meaning of the C,F
bits.

Rationale for the record: Aaron's variable-length setup ident encoding
is much more flexible, but also adds complexity. While it's true that's
nothing compared to actual Vorbis or Theora decode, the packetizer is not
necessarily an integral part of the decoder. I'm particularly worried
about the additional decision overhead for embedded use, where the fixed
32 bit alignment makes things dead easy. It's also a stronger decision
against the 32 bit CRC proposal, preventing any kind of
application-specific fallback. Finally, next to the 32 bit SSRC id and
timestamp in the RTP header itself, the 3 byte overhead for non-chained
streams felt like a reasonable balance to me.

So there we go, progress!

I've made no decision on the out-of-band setup transmission issue
though. Storage formats, levels of indirection and so on are still up
in the air. Really, I'd like to see some implementations try different
things.

One thing I think we can define is in-band transmission. Just send the
header packets in the RTP stream like they were any other type. The
16 bit setup ident field of the header packets must match the setup
ident of the data packets that rely on them. In the default case the
three standard headers are sent at the beginning of the stream, just
as they are in Ogg. The server MAY also retransmit the headers
periodically; clients MUST handle such header packets when they occur.

-r
Phil Kerr
2005-05-01 16:28:16 UTC
Post by Ralph Giles
Post by Ralph Giles
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Setup Ident          |   Reserved    |C|F|R| # pkts. |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Discussion on this point seems to have petered out. I'm going to decide
executively in favor of this proposal rather than Aaron's, with the
blessing of Fluendo. Phil, please (again) update the draft to reflect
this new payload header, and Aaron's changes to the meaning of the C,F
bits.
Will do.
Post by Ralph Giles
Rationale for the record: Aaron's variable-length setup ident encoding
is much more flexible, but also adds complexity. While it's true that's
nothing compared to actual Vorbis or Theora decode, the packetizer is not
necessarily an integral part of the decoder. I'm particularly worried
about the additional decision overhead for embedded use, where the fixed
32 bit alignment makes things dead easy. It's also a stronger decision
against the 32 bit CRC proposal, preventing any kind of
application-specific fallback. Finally, next to the 32 bit SSRC id and
timestamp in the RTP header itself, the 3 byte overhead for non-chained
streams felt like a reasonable balance to me.
So there we go, progress!
Yippee!
Post by Ralph Giles
I've made no decision on the out-of-band setup transmission issue
though. Storage formats, levels of indirection and so on are still up
in the air. Really, I'd like to see some implementations try different
things.
This can be looked at later.
Post by Ralph Giles
One thing I think we can define is in-band transmission. Just send the
header packets in the RTP stream like they were any other type. The
16 bit setup ident field of the header packets must match the setup
ident of the data packets that rely on them.
Do we care about setup ident collisions?
Post by Ralph Giles
In the default case the
three standard headers are sent at the beginning of the stream, just
as they are in Ogg. The server MAY also retransmit the headers
periodically; clients MUST handle such header packets when they occur.
This is catered for in the current documentation.

-P
Post by Ralph Giles
-r
_______________________________________________
xiph-rtp mailing list
http://lists.xiph.org/mailman/listinfo/xiph-rtp
Ralph Giles
2005-05-01 16:38:14 UTC
Post by Ralph Giles
Phil, please (again) update the draft to reflect
Post by Ralph Giles
this new payload header, and Aaron's changes to the meaning of the C,F
bits.
Will do.
Thanks!
Post by Ralph Giles
Post by Ralph Giles
One thing I think we can define is in-band transmission. Just send the
header packets in the RTP stream like they were any other type. The
16 bit setup ident field of the header packets must match the setup
ident of the data packets that rely on them.
Do we care about setup ident collisions?
The ident needs to match so the decoder knows which packets they apply
to; this allows the in-band transmission to work without a prior SDP
exchange. The encoder MUST ensure that the ident->decode setup mapping
is unique within a given RTP session. Is that what you meant?

-r
Aaron Colwell
2005-05-02 14:08:32 UTC
Post by Ralph Giles
Rationale for the record: Aaron's variable-length setup ident encoding
is much more flexible, but also adds complexity. While it's true that's
nothing compared to actual Vorbis or Theora decode, the packetizer is not
necessarily an integral part of the decoder. I'm particularly worried
about the additional decision overhead for embedded use, where the fixed
32 bit alignment makes things dead easy. It's also a stronger decision
against the 32 bit CRC proposal, preventing any kind of
application-specific fallback. Finally, next to the 32 bit SSRC id and
timestamp in the RTP header itself, the 3 byte overhead for non-chained
streams felt like a reasonable balance to me.
When you say "embedded" are you referring to a hardware implementation of this
packetization scheme or just its use in an embedded device? If it's the latter
then I don't see the tiny bit of extra complexity as something to worry about.
Robust sorted and interleaved AMR-NB packetization (RFC 3267) is WAY more
complex than this and I've had it running perfectly fine on a 104MHz ARM that
is also doing H.263 decoding at the same time. Most multimedia embedded devices
these days have more horsepower than this so I don't think that the variable
length ID is that big of a complexity overhead.

Is it really likely that a hardware encoder will actually use
several chains in a single transmission? If not then there is no extra
complexity for the variable length field. If you're doing hardware
decode it seems likely to me that depacketization and RTCP ops would be done
by a general purpose processor and the hardware would only do decoding. If that
is the case then my earlier argument applies. If you aren't using a general
purpose processor, then the logic needed for the IP and RTP stacks will be
WAY LARGER than the logic to support this variable length field.

I also don't really understand your argument about comparing the variable
length ID field complexity with the decode complexity. Hardware bitstream
parsing is probably several orders of magnitude more complex than the variable
length field. I doubt the added logic needed to parse the variable length ID
field would even be noticeable compared to the amount of logic needed to parse
the bitstream. Yes there is more decision logic, but it is extremely trivial.

Just my $0.02

Aaron
Ralph Giles
2005-05-02 15:28:49 UTC
Post by Aaron Colwell
When you say "embedded" are you referring to a hardware implementation of this
packetization scheme or just its use in an embedded device?
You know, the hand-wavy, "won't someone think of the embedded
programmers" sense! :-)

Your point is good that the variable length ident coding adapts to any
alignment requirement. In the light of that, my rationale reduces to
greater simplicity with the fixed-length, aligned payload header.

Also for the record, we discussed this on IRC, and Aaron accepts the
decision regardless.

-r
Ralph Giles
2005-05-02 17:07:48 UTC
Post by Ralph Giles
One thing I think we can define is in-band transmission. Just send the
header packets in the RTP stream like they were any other type. The
16 bit setup ident field of the header packets must match the setup
ident of the data packets that rely on them. In the default case the
three standard headers are sent at the beginning of the stream, just
as they are in Ogg. The server MAY also retransmit the headers
periodically; clients MUST handle such header packets when they occur.
Some further clarification from discussion with Aaron on IRC:

Clients MUST handle any header packet at any point in the RTP stream;
there is no restriction on ordering or grouping. When using in-band
transmission, the server SHOULD send the initial 3 header packets, in
order, at the beginning of the stream, including the comment header even
if it contains only the required vendor string from the encoder.

The server MUST ensure that the setup ident field in the payload header
attached to any ident or setup header packets is a consistent and unique
mapping within an RTP session, so that clients can safely discard such
packets if they already have a decode setup for that ident field.
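The discard rule above could look something like the following on the client side. This is only a sketch under stated assumptions: the class and method names are hypothetical, header packets are assumed to arrive already classified as "ident" or "setup", and comment packets are deliberately left out because they follow different rules.

```python
from typing import Dict, Tuple


class SetupCache:
    """Hypothetical client-side store of header packets keyed by setup ident.

    Relies on the server guarantee that an ident maps to exactly one decode
    setup within the RTP session, so a repeated header can be dropped unread.
    """

    def __init__(self) -> None:
        self._headers: Dict[Tuple[int, str], bytes] = {}

    def on_header_packet(self, ident: int, kind: str, payload: bytes) -> bool:
        """Store the packet and return True, or return False if it was
        discarded because this (ident, kind) pair is already known."""
        key = (ident, kind)
        if key in self._headers:
            return False  # retransmission: safe to discard
        self._headers[key] = payload
        return True

    def has_decode_setup(self, ident: int) -> bool:
        """True once both the ident and setup headers for `ident` are held."""
        return ((ident, "ident") in self._headers
                and (ident, "setup") in self._headers)
```

A data packet whose ident has no complete setup yet would have to be dropped or buffered until the headers show up; which of the two is an application choice this sketch does not make.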

The server MAY choose to send multiple comment packets with the same
ident field value. Such packets, when received, indicate to the client
that the new set of metadata applies to the audio data that follows. In
this way song title changes and so on can be indicated without the
overhead of a chain boundary and sending a completely new set of
headers. Applications spooling an RTP stream to an Ogg stream or file
should insert a chain boundary or begin a new file when these
metadata updates occur, even though the decoder setup has not changed.

-r
Aaron Colwell
2005-05-02 17:44:09 UTC
Post by Ralph Giles
The server MAY choose to send multiple comment packets with the same
ident field value. Such packets, when received, indicate to the client
that the new set of metadata applies to the audio data that follows. In
this way song title changes and so on can be indicated without the
overhead of a chain boundary and sending a completely new set of
headers. Applications spooling an RTP stream to an Ogg stream or file
should insert a chain boundary or begin a new file when these
metadata updates occur, even though the decoder setup has not changed.
I just want to add a little clarification here. A chain boundary only needs
to be created if the comment packet doesn't exactly match the metadata received
in a previous comment packet for that chainID. This allows periodic
transmission of the comment packet without a spooler creating chains for every
comment packet that comes in. (Think archiving a multicast radio feed)
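The spooler rule described here can be sketched as a byte-for-byte comparison against the last comment packet seen for the same ident. The class below is hypothetical, and it assumes the first comment packet for an ident belongs to the initial header set and therefore does not count as a new boundary.

```python
from typing import Dict


class CommentTracker:
    """Hypothetical helper for an application spooling RTP to Ogg."""

    def __init__(self) -> None:
        self._last_comment: Dict[int, bytes] = {}
        self.boundaries = 0  # chain boundaries triggered by metadata changes

    def on_comment_packet(self, ident: int, comment: bytes) -> bool:
        """Return True if the spooler should start a new chain segment."""
        previous = self._last_comment.get(ident)
        self._last_comment[ident] = comment
        if previous is None:
            return False  # assumption: first comment is part of initial headers
        if previous == comment:
            return False  # periodic retransmission, nothing changed
        self.boundaries += 1
        return True
```

With this check, a multicast feed that resends the same comment packet every few seconds produces no extra chain segments, while a song title change still does.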

Aaron
Phil Kerr
2005-05-02 19:57:23 UTC
Post by Ralph Giles
Post by Ralph Giles
One thing I think we can define is in-band transmission. Just send the
header packets in the RTP stream like they were any other type. The
16 bit setup ident field of the header packets must match the setup
ident of the data packets that rely on them. In the default case the
three standard headers are sent at the beginning of the stream, just
as they are in Ogg. The server MAY also retransmit the headers
periodically; clients MUST handle such header packets when they occur.
Clients MUST handle any header packet at any point in the RTP stream;
there is no restriction on ordering or grouping. When using in-band
transmission, the server SHOULD send the initial 3 header packets, in
order, at the beginning of the stream, including the comment header even
if it contains only the required vendor string from the encoder.
This is detailed in the current specification.
Post by Ralph Giles
The server MUST ensure that the setup ident field in the payload header
attached to any ident or setup header packets is a consistent and unique
mapping within an RTP session, so that clients can safely discard such
packets if they already have a decode setup for that ident field.
Because the Ident field is freeform (not derived from the stream itself
using crc32) you will lose caching.
Post by Ralph Giles
The server MAY choose to send multiple comment packets with the same
ident field value. Such packets, when received, indicate to the client
that the new set of metadata applies to the audio data that follows.
This is detailed in the current specification.

-P
Aaron Colwell
2005-05-02 20:07:12 UTC
Post by Phil Kerr
Post by Ralph Giles
The server MUST ensure that the setup ident field in the payload header
attached to any ident or setup header packets is a consistent and unique
mapping within an RTP session, so that clients can safely discard such
packets if they already have a decode setup for that ident field.
Because the Ident field is freeform (not derived from the stream itself
using crc32) you will lose caching.
You can still have caching. You just need the ident -> chainID and
codebook -> chainID mappings conveyed out of band or at least have a
base URL that you can use to grab the mapping information. This info can be
conveyed via SDP.

I suppose the base URL and/or chain ID to codebook/ident hashes could be
transmitted inline as well. That info should be pretty low BW.

Aaron
Phil Kerr
2005-05-02 20:23:11 UTC
You just need the ident -> chainID and codebook -> chainID mappings conveyed out of band
And this is better than having it inband?
Aaron Colwell
2005-05-02 20:42:03 UTC
If you do it out of band then you can use a reliable transport. For example if
this info is put in the SDP then it will be reliably transmitted to the client
over the RTSP link. In band signalling can be lossy so you have to do periodic
transmission to ensure reliability.
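To put a rough number on the periodic-transmission point, here is a small sketch. It assumes each packet is lost independently with the same probability, which real networks with bursty loss do not satisfy, and it assumes three header packets per set as in the default case discussed earlier. The function name is made up for illustration.

```python
def probability_headers_received(loss_rate: float, transmissions: int,
                                 packets_per_set: int = 3) -> float:
    """Chance that a client has received at least one copy of every header
    packet after the full set has been sent `transmissions` times,
    assuming independent per-packet loss (a simplifying assumption)."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if transmissions < 1:
        raise ValueError("transmissions must be at least 1")
    per_packet = 1.0 - loss_rate ** transmissions
    return per_packet ** packets_per_set


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, probability_headers_received(0.05, k))
```

Under these assumptions the miss probability falls off quickly with each repeat, but it never reaches zero, which is the contrast with a reliable out-of-band channel.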

Aaron
Post by Phil Kerr
You just need the ident -> chainID and codebook -> chainID mappings conveyed out of band
And this is better than having it inband?