Discussion:
[xiph-rtp] A few other comments
Tor-Einar Jarnbjo
2005-01-04 21:54:45 UTC
Hi,

a few things I stumbled across in the latest draft version, related to new
paragraphs or things I haven't noticed before:

* 2.2 Payload Header

* Codebook Ident: 32 bits
*
* This 32 bit field is used to associate the Vorbis data to a decoding
* Codebook. It is created by making a CRC32 checksum of the codebook
* required to decode the particular Vorbis audio stream.

It should be defined exactly which data is used to create the codebook
checksum. One thing that is not obvious to me is whether the packet type
descriptor should be included when calculating the checksum. I am also
wondering whether it would be better (to avoid random checksum collisions)
to use a stronger hash algorithm than CRC32.
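
To illustrate why the input needs pinning down, here is a minimal Python
sketch. It assumes zlib's CRC-32 variant (the draft text quoted above does
not say which CRC-32 is meant) and uses placeholder bytes rather than a real
codebook; only the leading 0x05 packet type octet and the "vorbis" string
follow the Vorbis I header layout. The resulting ident depends on whether
the packet type octet is hashed:

```python
import zlib

# Vorbis I header packets start with a one-octet packet type
# (0x05 for the setup/codebook header) followed by ASCII "vorbis".
packet_type = b"\x05"
# Placeholder payload for illustration only, not a real codebook.
body = b"vorbis" + bytes(range(32))

# Candidate 1: checksum over the whole header packet, type octet included.
ident_with_type = zlib.crc32(packet_type + body) & 0xFFFFFFFF
# Candidate 2: checksum over everything after the type octet.
ident_without_type = zlib.crc32(body) & 0xFFFFFFFF

print(f"with packet type:    {ident_with_type:08x}")
print(f"without packet type: {ident_without_type:08x}")
```

A sender and a receiver that pick different candidates would never agree on
the ident, even though both compute a valid CRC32.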

* 3. Frame Packetizing
*
* Any Vorbis data packet that is 256 octets or less SHOULD be bundled
* in the RTP packet with as many Vorbis packets as will fit, up to a
* maximum of 16.

Wouldn't it make sense here to have some more formal description of the RTP
packet length than just "as will fit"?
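
One way to make "as will fit" concrete is to state the budget explicitly.
The sketch below is a hypothetical greedy bundler, not something taken from
the draft: payload_budget, the one-length-octet-per-packet cost and the
1450-octet figure are assumptions chosen for illustration.

```python
def bundle(packet_sizes, payload_budget, max_per_rtp=16, small_limit=256):
    """Group consecutive small Vorbis packets into RTP payloads.

    Returns a list of lists of packet sizes. Packets larger than
    small_limit are returned alone, since they would be fragmented
    rather than bundled.
    """
    groups, current, used = [], [], 0
    for size in packet_sizes:
        if size > small_limit:
            # Flush whatever is pending, then emit the large packet alone.
            if current:
                groups.append(current)
                current, used = [], 0
            groups.append([size])
            continue
        cost = size + 1  # assumed: one length octet per bundled packet
        if current and (len(current) == max_per_rtp
                        or used + cost > payload_budget):
            groups.append(current)
            current, used = [], 0
        current.append(size)
        used += cost
    if current:
        groups.append(current)
    return groups


# Assumed budget: a 1500-octet MTU minus roughly 50 octets of headers.
print(bundle([100, 200, 250, 300, 90, 90], payload_budget=1450))
# [[100, 200, 250], [300], [90, 90]]
```

A formal rule in the draft could take the same shape: a maximum payload
size derived from the path MTU, plus the existing cap of 16 packets.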

* If a Vorbis packet is larger than 256 octets it MUST be fragmented.

Is it really necessary to have a one octet Vorbis packet length field in the
RTP packet and limit the Vorbis packet length to 256 octets? Most Vorbis
packets are longer (at common bitrates, I would estimate 500-1000 bytes) and
most network paths have a higher MTU, making them able to transport common
Vorbis packets without splitting them, so splitting causes unnecessary
bandwidth overhead. The UDP packet header, the RTP packet header and the 6
octets for the Vorbis specific data add up to about 50 octets. That is at
least 20% if the rest of the data is limited to 256 octets.

* 4. Configuration Headers
*
* To decode a Vorbis stream three configuration header blocks are
* needed. The first header indicates the sample and bitrates, the
* number of channels and the version of the Vorbis encoder used. The
* second header contains the decoders probability model, or codebook
* and the third header details stream metadata.

The metadata header block is not needed for stream decoding. It was already
discussed to even recommend using some other metadata format together with
Vorbis/RTP, as the Vorbis metadata header is rather limited.

* 4.1 In-band Header Transmission
*
* The three header data blocks are sent in-band with the packet type
* bits set to match the payload type. The transmission sequence for
* the headers MUST be in this order: configuration, codebook,
* metadata.

This MUST is technically irrelevant. Even if the Vorbis spec itself has this
unnecessary "MUST", I don't see any point in copying it in the RTP RFC. As
long as the configuration and the codebook headers have been received
properly by the client (in any order), it is able to decode the audio stream.

* A 16 bit codebook length field precedes the codebook datablock. The
* length field allows for codebooks to be up to 64K in size.

Is it ok to limit the codebook length to 64 kilobytes? Is it ok to
abbreviate kilobyte with K? The text graphic on page 13 shows a 32 bit
codebook length field.

Why is there a codebook ident in the configuration header packet?

The first paragraph on page 13 is not clear to me. Will the codebook ident
of the codebook packet itself in some cases differ from the actually
transmitted codebook? In any case, what's the point in including it? AFAIK,
RTP depends on reliable packet transport anyway, so there is no need to add
checksums, just to detect corrupted data.

For the same reason that a more detailed description of the checksum
calculation is necessary, section 4.3 should also describe the format in
which the codebook header will be delivered from the specified URI
(at least whether it comes with or without the packet type descriptor).


Tor
Phil Kerr
2005-01-05 00:09:51 UTC
Hi Tor,
Post by Tor-Einar Jarnbjo
Hi,
a few things I stumbled across in the latest draft version, related to new
paragraphs or things I haven't noticed before:
* 2.2 Payload Header
* Codebook Ident: 32 bits
*
* This 32 bit field is used to associate the Vorbis data to a decoding
* Codebook. It is created by making a CRC32 checksum of the codebook
* required to decode the particular Vorbis audio stream.
It should be defined exactly which data is being used to create the codebook
checksum. One thing, which is not obvious to me is if the packet type
descriptor shall be included when calculating the checksum. I am also
wondering if it's not better (to avoid random checksum duplicates) to use a
better hash algorithm than CRC32.
We discussed this on the list, not too long ago, about using CRC32 and
the consensus was collisions wouldn't be an issue, and moving to
something like MD5 would be too heavy. If collisions are a problem then
this approach will need to be looked at again. The reason for having
this in the stream is so there is a hard association between the raw
Vorbis data and its associated codebook. What replacement to CRC32 did
you have in mind?
Post by Tor-Einar Jarnbjo
* 3. Frame Packetizing
*
* Any Vorbis data packet that is 256 octets or less SHOULD be bundled
* in the RTP packet with as many Vorbis packets as will fit, up to a
* maximum of 16.
Wouldn't it make sense here to have some more formal description of the RTP
packet length than just "as will fit"?
Possibly, although the language is slightly loose, it's OK.
Post by Tor-Einar Jarnbjo
* If a Vorbis packet is larger than 256 octets it MUST be fragmented.
Is it really necessary to have a one octet Vorbis packet length field in the
RTP packet and limit the Vorbis packet length to 256 octets? Most Vorbis
packets are longer (at common bitrates, I would estimate 500-1000 bytes) and
most network paths have a higher MTU, making them able to transport common
Vorbis packets without splitting them, so splitting causes unnecessary
bandwidth overhead. The UDP packet header, the RTP packet header and the 6
octets for the Vorbis specific data add up to about 50 octets. That is at
least 20% if the rest of the data is limited to 256 octets.
The length octet dates back to Jack's original draft so I'm sure he can
answer this better than me. But if this was bumped up to 16 bits then
more often than not the top 6 or 7 bits aren't going to be used. Is this
not a waste?
Post by Tor-Einar Jarnbjo
* 4. Configuration Headers
*
* To decode a Vorbis stream three configuration header blocks are
* needed. The first header indicates the sample and bitrates, the
* number of channels and the version of the Vorbis encoder used. The
* second header contains the decoders probability model, or codebook
* and the third header details stream metadata.
The metadata header block is not needed for stream decoding. It was already
discussed to even recommend using some other metadata format together with
Vorbis/RTP, as the Vorbis metadata header is rather limited.
Ok, I must have missed where dropping the metadata header was
discussed. I understand that Annodex has been mentioned but is this to
be used as a 100% replacement? This will complicate implementation as
you have to create two RTP streams - one for audio and one for
metadata. I thought that Vorbis has the basic metadata which can be
augmented with Annodex if the particular usage instance required
something more detailed.
Post by Tor-Einar Jarnbjo
* 4.1 In-band Header Transmission
*
* The three header data blocks are sent in-band with the packet type
* bits set to match the payload type. The transmission sequence for
* the headers MUST be in this order: configuration, codebook,
* metadata.
This MUST is technically irrelevant. Even if the Vorbis spec itself has this
unnecessary "MUST", I don't see any point in copying it in the RTP RFC. As
long as the configuration and the codebook headers have been received
properly by the client (in any order), it is able to decode the audio stream.
Yes, this has slipped through and can be removed.
Post by Tor-Einar Jarnbjo
* A 16 bit codebook length field precedes the codebook datablock. The
* length field allows for codebooks to be up to 64K in size.
Is it ok to limit the codebook length to 64 kilobytes? Is it ok to
abbreviate kilobyte with K? The text graphic on page 13 shows a 32 bit
codebook length field.
This was discussed a long while ago and the consensus was, dare I say
it, 64k should be big enough for anyone's codebook.
Post by Tor-Einar Jarnbjo
Why is there a codebook ident in the configuration header packet?
The first paragraph on page 13 is not clear to me. Will the codebook ident
of the codebook packet itself in some cases differ from the actually
transmitted codebook? In any case, what's the point in including it? AFAIK,
RTP depends on reliable packet transport anyway, so there is no need to add
checksums, just to detect corrupted data.
RTP is based on UDP, not TCP, so there is no reliability or
retransmission ability built-in. The checksum is there to offer some
mechanism to check if the packet has been corrupted, and as the
codebooks have to be delivered intact it's important there is some
safeguard mechanism. If they are being fetched over TCP from the SDP
URI (is this what you meant?) then yes, it's redundant but it doesn't
really cause a problem by having it in as it still provides a check.
Post by Tor-Einar Jarnbjo
Because of the same reason why a more detailed description of the checksum
calculation is necessary, it should also be described in section 4.3 in
which format the codebook header will be delivered from the specified URI
(at least with or without packet type descriptor).
Agreed.


-P
Post by Tor-Einar Jarnbjo
Tor
_______________________________________________
xiph-rtp mailing list
http://lists.xiph.org/mailman/listinfo/xiph-rtp
Tor-Einar Jarnbjo
2005-01-06 21:19:57 UTC
Hi Phil,
Post by Phil Kerr
Vorbis data and its associated codebook. What replacement to CRC32 did
you have in mind?
I was thinking about something like MD5, but CRC32 will probably do just as
well.
Post by Phil Kerr
The length octet dates back to Jack's original draft so I'm sure he can
answer this better than me. But if this was bumped up to 16 bits then
more often than not the top 6 or 7 bits aren't going to be used, is this
not a waste?
I just picked and analyzed a random Vorbis file and it has a total of 26465
packets with the following sizes:

10130 [0,256> bytes
14505 [256,512> bytes
1830 [512,615] bytes

With the length field extended to 16 bits, all packets can be transmitted
without splitting, but this adds 26465 bytes of overhead to the total data
amount.

Splitting the 16335 packets with a length >= 256 bytes, however, will cause
an additional 18165 packets to be transmitted (14505 + 2*1830). Assuming a
50 byte overhead for the UDP header, RTP header and the Vorbis specific
header, that adds ~900 kB to the total data amount.
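
The arithmetic above can be re-checked in a few lines of Python. The
fragment counts follow the estimate in this message (two fragments for the
middle bucket, three for the top one), and the 50-octet per-packet overhead
is the same rough figure used in the thread, not a value from the draft:

```python
# (packet count, fragments needed at a 256-octet limit) per size bucket,
# using the histogram and the fragment estimate from this message.
buckets = {
    "[0,256>": (10130, 1),
    "[256,512>": (14505, 2),
    "[512,615]": (1830, 3),
}
PER_PACKET_OVERHEAD = 50  # octets; rough UDP + RTP + Vorbis header estimate

total_packets = sum(count for count, _ in buckets.values())
extra_packets = sum(count * (frags - 1) for count, frags in buckets.values())

# Option A: widen the length field by one octet on every packet.
wider_length_field_cost = total_packets * 1
# Option B: keep one octet and pay full headers for every extra fragment.
fragmentation_cost = extra_packets * PER_PACKET_OVERHEAD

print(total_packets, extra_packets)                 # 26465 18165
print(wider_length_field_cost, fragmentation_cost)  # 26465 908250
```

For this one file, fragmenting costs roughly 34 times more octets than the
wider length field.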
Post by Phil Kerr
I thought that Vorbis has the basic metadata which can be
augmented with Annodex if the particular usage instance required
something more detailed.
Ok, I sort of agree on this, but wouldn't it at least be wise to define the
Vorbis metadata header as optional and not mandatory. If a particular setup
decides to use any other metadata format, it would allow a cleaner design if
the Vorbis metadata is completely left out instead of forcing the server to
send an empty Vorbis metadata header, just to satisfy the RFC. In many
cases, even the limited metadata fields in the SDP may be enough for a
particular purpose.
Post by Phil Kerr
Post by Tor-Einar Jarnbjo
Why is there a codebook ident in the configuration header packet?
RTP is based on UDP, not TCP, so there is no reliability or
retransmission ability built-in.
This depends on what you mean by "reliability". UDP does not offer a
guarantee that the packet is being received by the client, but _if_ the
packet is being received, you can rely on its correct content, as the UDP
header contains a checksum field to ensure data integrity, which has to be
checked by the UDP stack before forwarding the received packet to the user
software.
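
For reference, the integrity field in the UDP header is a 16-bit ones'
complement checksum (RFC 768, computed as in RFC 1071) rather than a CRC,
and over IPv4 a sender may set it to zero to disable it. The sketch below
implements only the summing step over a byte string; a real UDP checksum
also covers a pseudo-header and the UDP header, which are left out here.
The payload strings are made up for illustration.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = b"vorbis codebook bytes"
swapped = b"rbvois codebook bytes"  # first two 16-bit words exchanged

# The sum is order-independent, so reordered words go undetected...
print(internet_checksum(payload) == internet_checksum(swapped))  # True
# ...while CRC32 tells the two payloads apart.
print(zlib.crc32(payload) == zlib.crc32(swapped))  # False
```

So the transport-level check is real but weaker than a CRC32 over the
codebook, which may be relevant to whether the ident is redundant.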

Tor
Ralph Giles
2005-01-06 21:41:52 UTC
Post by Tor-Einar Jarnbjo
Ok, I sort of agree on this, but wouldn't it at least be wise to define the
Vorbis metadata header as optional and not mandatory. If a particular setup
decides to use any other metadata format, it would allow a cleaner design if
the Vorbis metadata is completely left out instead of forcing the server to
send an empty Vorbis metadata header, just to satisfy the RFC. In many
cases, even the limited metadata fields in the SDP may be enough for a
particular purpose.
Well, it won't be empty; the metadata packet always has a vendor string
identifying the encoder. Maybe that's not important.

Part of the idea here, I think, is to avoid contradicting the Vorbis spec
more than necessary. If you pass all the headers as expected and they're
received properly, in a brain-dead unicast case you can just feed them
to libvorbis with no extra fuss. Not that (fixable) programming
inconvenience is an argument, but regularity of expectations is.

-r
Phil Kerr
2005-01-07 00:05:38 UTC
Hi Tor,
Post by Tor-Einar Jarnbjo
Hi Phil,
Post by Phil Kerr
Vorbis data and its associated codebook. What replacement to CRC32 did
you have in mind?
I was thinking about something like MD5, but CRC32 will probably do just as
well.
MD5 is more collision-proof, but its size means it is impractical to tag
each packet with a codebook key and this was one of the design goals,
having a hard association between stream and decoding.
Post by Tor-Einar Jarnbjo
Post by Phil Kerr
The length octet dates back to Jack's original draft so I'm sure he can
answer this better than me. But if this was bumped up to 16 bits then
more often than not the top 6 or 7 bits aren't going to be used, is this
not a waste?
I just picked and analyzed a random Vorbis file and it has a total of 26465
packets with the following sizes:
10130 [0,256> bytes
14505 [256,512> bytes
1830 [512,615] bytes
With the length field extended to 16 bits, all packets can be transmitted
without splitting, but this adds 26465 bytes of overhead to the total data
amount.
Splitting the 16335 packets with a length >= 256 bytes, however, will cause
an additional 18165 packets to be transmitted (14505 + 2*1830). Assuming a
50 byte overhead for the UDP header, RTP header and the Vorbis specific
header, that adds ~900 kB to the total data amount.
I've just had a look at a few files.

||---------------------------------------------------------------------------||
|| vsize: Vorbis frame size stat generator.
||
|| Filename: qq-2004-12-11c.ogg
||
||---------------------------------------------------------------------------||
|| Stats.
||
|| Total number of Vorbis frames: 29716
|| Number of small (<= 256 bytes) Vorbis frames: 29716
|| Number of medium (> 256 && < 512 bytes) Vorbis frames: 0
|| Number of big (> 512 && < 1024 bytes) Vorbis frames: 0
|| Number of jumbo (>= 1024 bytes) Vorbis frames: 0

This file is from a CBC recording that Ralph tried with a very early
version of the RTP server. Notice it has no Vorbis frames greater than
256 bytes.

The next file is an encoding of a CD rip.

||---------------------------------------------------------------------------||
|| vsize: Vorbis frame size stat generator.
||
|| Filename: ../tunes/max_graham__tranceport4_disk_2.ogg
||
||---------------------------------------------------------------------------||
|| Stats.
||
|| Total number of Vorbis frames: 567071
|| Number of small (<= 256 bytes) Vorbis frames: 437171
|| Number of medium (> 256 && < 512 bytes) Vorbis frames: 698
|| Number of big (> 512 && < 1024 bytes) Vorbis frames: 5011
|| Number of jumbo (>= 1024 bytes) Vorbis frames: 124191

Lots of big frames!

Both files were encoded with the Vorbis I 20020717 (1.0) libs, the first
one has an average bitrate of 52.27 kbps, the second 469.14 kbps.
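
For anyone wanting comparable numbers without the tool, the bucketing
itself is simple. This is not the vsize source, only a sketch that assumes
the packet sizes have already been extracted from the Ogg stream by some
other means. The bucket names mirror the reports above, with sizes of
exactly 512 counted as "big" (an assumption, since the report labels leave
that value out).

```python
from collections import Counter

BUCKETS = ("small", "medium", "big", "jumbo")


def classify(size: int) -> str:
    """Map a Vorbis packet size in bytes to a report bucket."""
    if size <= 256:
        return "small"
    if size < 512:
        return "medium"
    if size < 1024:
        return "big"   # includes exactly 512, by assumption
    return "jumbo"


def frame_stats(sizes):
    """Count packets per bucket, always reporting all four buckets."""
    counts = Counter(classify(s) for s in sizes)
    return {name: counts.get(name, 0) for name in BUCKETS}


# Made-up sizes, chosen to hit every bucket boundary.
print(frame_stats([40, 256, 257, 512, 900, 1024, 4000]))
# {'small': 2, 'medium': 1, 'big': 2, 'jumbo': 2}
```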

I certainly think this issue needs to be looked at further as whilst the
change is quite small it does have a large impact on the stream
overheads. Another side effect of increasing the limit to 16 bits is
the probable increase in the number of packets that will exceed path MTU
and will fragment.

[I'll put the vsize code at http://plus24.com/xiph/vsize.c.gz if anyone
is interested in generating some Vorbis frame stats.]
Post by Tor-Einar Jarnbjo
Post by Phil Kerr
I thought that Vorbis has the basic metadata which can be
augmented with Annodex if the particular usage instance required
something more detailed.
Ok, I sort of agree on this, but wouldn't it at least be wise to define the
Vorbis metadata header as optional and not mandatory. If a particular setup
decides to use any other metadata format, it would allow a cleaner design if
the Vorbis metadata is completely left out instead of forcing the server to
send an empty Vorbis metadata header, just to satisfy the RFC. In many
cases, even the limited metadata fields in the SDP may be enough for a
particular purpose.
As Ralph has mentioned, this is to keep everything happy with regards to
the official Vorbis specs, which mandate that the metadata header is
required, even if the only info is the vendor string.

I see the use of the basic 'built-in' metadata as providing your
bog-standard stream/track info but with Annodex you can offer a far
richer description. In some scenarios the basic model will be fine, but in
others (plays or opera being good examples) having lots of plot metadata
would be very good! I covered this for opera in one of my former
projects using MPEG-7 and Silvia has done similar work with Annodex
which goes in a slightly different direction. Silvia, can you elaborate
on this?
Post by Tor-Einar Jarnbjo
Post by Phil Kerr
Post by Tor-Einar Jarnbjo
Why is there a codebook ident in the configuration header packet?
RTP is based on UDP, not TCP, so there is no reliability or
retransmission ability built-in.
This depends on what you mean by "reliability". UDP does not offer a
guarantee that the packet is being received by the client, but _if_ the
packet is being received, you can rely on its correct content, as the UDP
header contains a checksum field to ensure data integrity, which has to be
checked by the UDP stack before forwarding the received packet to the user
software.
True, but if you have a codebook that spans multiple packets it's a lot
easier to deal with one CRC, than several. Also the packet CRC covers
the headers, not just the payload (iirc, need to check this).
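
As a rough illustration of the "one CRC" point: a receiver can fold each
codebook fragment into a running CRC32 as it arrives and compare the final
value against the ident once. The sketch assumes zlib's CRC-32 variant and
256-octet fragments, and uses placeholder bytes instead of a real codebook;
crc32_of_fragments is a hypothetical helper name.

```python
import zlib


def crc32_of_fragments(fragments):
    """Running CRC32 over fragments, in arrival order."""
    crc = 0
    for chunk in fragments:
        crc = zlib.crc32(chunk, crc)  # second argument continues the CRC
    return crc & 0xFFFFFFFF


codebook = bytes(range(256)) * 20  # placeholder data, 5120 octets
fragments = [codebook[i:i + 256] for i in range(0, len(codebook), 256)]

whole = zlib.crc32(codebook) & 0xFFFFFFFF
print(crc32_of_fragments(fragments) == whole)  # True
```

Note that this only works if the fragments are processed in order, so lost
or reordered fragments have to be handled before the comparison.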
Post by Tor-Einar Jarnbjo
Tor
Cheers

-P
