[xiph-rtp] Chaining Theora codebook changes
David Barrett
2005-05-21 03:47:41 UTC
Hi, I'm still getting up to speed on the topic so please forgive my
ignorance, but does "chaining" mean "changing codebooks on the fly"?
If so, what is the latest proposal on this?

Looking over the archive, it sounds like:

- The server CAN specify one or more codebook URLs in SDP
- The server CAN specify new codebooks at will inline
- The server CAN change codebooks at will, and the client MUST accept

Are these statements true?

I know the precise language and syntax are in flux, but it sounds like
each packet will have some kind of variable-length codebook
identifier, effectively choosing from the codebooks predefined by the
SDP. Is this true? Can I further choose not only from those defined
in the SDP at the start, but those delivered inline since?

Next, are there any restrictions on what can change with the
codebook? Can I change framerates, bitrates, resolutions -- basically
anything I want to change?

Finally, if I have a stream that starts with codebook A, switches to
B, and then switches back to A, is it assumed that A picks up "where
it left off", or do I reset the encoder with codebook A? Is switching
codebooks (assuming I don't need to re-download them) an expensive
operation?

Thanks for your clarifications; just trying to make sure I have my
assumptions in line so I don't march myself into a corner down the
road.

-david
Ralph Giles
2005-05-21 06:18:35 UTC
Post by David Barrett
Hi, I'm still getting up to speed on the topic so please forgive my
ignorance, but does "chaining" mean "changing codebooks on the fly"?
If so, what is the latest proposal on this?
The idea is to do the same thing as the latest consensus for Vorbis.
That's only documented in the list archives at the moment.
Post by David Barrett
- The server CAN specify one or more codebook URLs in SDP
- The server CAN specify new codebooks at will inline
- The server CAN change codebooks at will, and the client MUST accept
Are these statements true?
Well, 'at will' has some caveats, but basically, yes.
Post by David Barrett
I know the precise language and syntax are in flux, but it sounds like
each packet will have some kind of variable-length codebook
identifier, effectively choosing from the codebooks predefined by the
SDP. Is this true? Can I further choose not only from those defined
in the SDP at the start, but those delivered inline since?
Yes. There's also a proposal for a URL construction scheme, so that the
server can generate new codebooks on the fly beyond what's in the SDP
and the client can construct a URL to retrieve them.
Post by David Barrett
Next, are there any restrictions on what can change with the
codebook? Can I change framerates, bitrates, resolutions -- basically
anything I want to change?
In principle anything in the decoder setup can be changed. I'm not sure
how things like frame size and rate, which have standard representations
in the SDP, would work. It may be that some session establishment protocols
will end up placing additional limits.
Post by David Barrett
Finally, if I have a stream that starts with codebook A, switches to
B, and then switches back to A, is it assumed that A picks up "where
it left off", or do I reset the encoder with codebook A? Is switching
codebooks (assuming I don't need to re-download them) an expensive
operation?
Not sure what you mean here. Parsing the codebook does take a little bit
of time, but should be small compared to frame decode or encode. One can
also of course cache the parsed codec setup and switch simply by sending
frames to one context or another.

Generating a particular set of codebooks on the encoder side could be
very expensive, of course, for example if they are tuned to a particular
input stream. It's not a problem on decode.
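
The caching idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the libtheora API: DecoderContext,
ContextCache and the fetch_setup callback are hypothetical names, and the
"parse" step is a placeholder for real header parsing.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DecoderContext:
    """Hypothetical stand-in for a decoder initialized from one codebook."""
    codebook_id: int
    setup: bytes              # raw setup data this context was built from
    frames_decoded: int = 0

    def decode(self, packet: bytes) -> None:
        # A real decoder would reconstruct a frame here; we only count.
        self.frames_decoded += 1


class ContextCache:
    """Parses each codebook once and reuses the resulting context."""

    def __init__(self, fetch_setup: Callable[[int], bytes]) -> None:
        self._fetch_setup = fetch_setup          # assumed: returns setup bytes
        self._contexts: Dict[int, DecoderContext] = {}
        self.parse_count = 0                     # how many setups were parsed

    def context_for(self, codebook_id: int) -> DecoderContext:
        ctx = self._contexts.get(codebook_id)
        if ctx is None:
            # First use of this codebook: pay the parsing cost once.
            setup = self._fetch_setup(codebook_id)
            self.parse_count += 1
            ctx = DecoderContext(codebook_id, setup)
            self._contexts[codebook_id] = ctx
        return ctx

    def decode(self, codebook_id: int, packet: bytes) -> DecoderContext:
        # Switching codebooks is just routing the packet to another context.
        ctx = self.context_for(codebook_id)
        ctx.decode(packet)
        return ctx
```

With a cache like this, a stream that goes A, B, A parses two setups rather
than three; the trade-off is the memory held by idle contexts.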
Post by David Barrett
Thanks for your clarifications; just trying to make sure I have my
assumptions in line so I don't march myself into a corner down the
road.
Sounds like you have it more or less right. The idea is really just to
support the same chaining you can do with the Ogg container, where
playback of the concatenation of any two streams must be handled,
preferably in as gapless a manner as possible.

-r
David Barrett
2005-05-22 20:34:40 UTC
Post by Ralph Giles
Post by David Barrett
Finally, if I have a stream that starts with codebook A, switches to
B, and then switches back to A, is it assumed that A picks up "where
it left off", or do I reset the encoder with codebook A?
Not sure what you mean here. Parsing the codebook does take a little bit
of time, but should be small compared to frame decode or encode. One can
also of course cache the parsed codec setup and switch simply by sending
frames to one context or another.
I'm sorry, let me clarify. It's my assumption that the codebook is
used to initialize the internal state of the decoder, and then each
frame modifies that internal state. So, if I switch from codebook 0
to codebook 1, and then back to codebook 0, I have two options:

1) Destroy and recreate a decoder using codebook 0, thereby resetting
its internal state.

2) Reuse the original decoder, thereby continuing on with the internal
state that has already been accumulated.

This is all based on the assumption that there is internal state
accumulated through time. If the decoder is essentially stateless,
then the above two options are the same (whether I create a new
decoder or reuse the old produces the same result as there is no
internal state being lost). So if this assumption is correct, which
of the above two options should I choose? I'm guessing #1, but I'd
like to confirm.

Thanks!

-david
Ralph Giles
2005-05-23 02:56:06 UTC
Post by David Barrett
I'm sorry, let me clarify. It's my assumption that the codebook is
used to initialize the internal state of the decoder, and then each
frame modifies that internal state. So, if I switch from codebook 0
to codebook 1, and then back to codebook 0, I have two options:
1) Destroy and recreate a decoder using codebook 0, thereby resetting
its internal state.
2) Reuse the original decoder, thereby continuing on with the internal
state that has already been accumulated.
Aha. So there are two things. The codebook does indeed initialize
internal state, but that state is orthogonal to the state changes
induced by decoding data packets.

The state related to frame decode consists solely of the previous
decoded frame, and the previous keyframe (if different). So you
will get incorrect results if you begin decoding anywhere but
a keyframe, but decoding frames (key or otherwise) does not affect
the state associated with decoding the header packets (codebook).

So your options (1) and (2) behave identically, except with respect
to the specific variety of bad output you'd get if you didn't
restart decoding under codebook 0 with a keyframe. If you do
start with a keyframe, as you should, there's no difference,
and it's just a question of convenience and memory footprint.
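
One way to picture the keyframe rule is a small gate in front of the decoder
that refuses inter frames until a keyframe has arrived under the current
codebook. This is a sketch only: Packet, its fields, and ChainedStreamGate are
hypothetical names for illustration and do not reflect the actual payload
format, which is still under discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    codebook_id: int      # assumed per-packet codebook identifier
    is_keyframe: bool
    payload: bytes = b""


class ChainedStreamGate:
    """Decides which packets are safe to hand to a decoder."""

    def __init__(self) -> None:
        self._active_codebook: Optional[int] = None
        self._have_reference = False   # True once a keyframe has been seen

    def accept(self, pkt: Packet) -> bool:
        if pkt.codebook_id != self._active_codebook:
            # Codebook switch: reference frames from the previous segment
            # must not be used, whether the decoder is new or reused.
            self._active_codebook = pkt.codebook_id
            self._have_reference = False
        if pkt.is_keyframe:
            self._have_reference = True
        # Inter frames are only decodable after a keyframe in this segment.
        return self._have_reference
```

Because the gate resets on every switch, it behaves the same whether the
decoder behind it was recreated or reused, which is the point made above.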

-r
David Barrett
2005-05-23 05:19:42 UTC
Ah, good. Thanks!