Internet-Draft Abbreviated Title February 2026
Whited Expires 21 August 2026
Workgroup: Internet Engineering Task Force
Internet-Draft: draft-swhited-ogg-stems-00
Published: February 2026
Intended Status: Informational
Expires: 21 August 2026
Author: ssw. Whited, Ed.

OGG Stem Files

Abstract

This document defines a multi-track profile of the Ogg container format for storing stems. Files that follow the profile remain backwards compatible with existing media players.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 21 August 2026.

1. Introduction

Stems are recordings of individual instruments, or clusters of instruments, used by DJs and music producers for live mixing of music. Historically, stem files have been stored as individual audio files, or in patent-encumbered or vendor-specific proprietary container formats. The Ogg file format developed by the Xiph.Org Foundation is specified in [RFC3533], with its media types defined in [RFC5334], and is well suited as a container for stems. This specification documents a profile of the Ogg container format that allows it to store lossless or lossy stems, as well as metadata about the stems, for use in DJ applications or Digital Audio Workstations.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Requirements

Stem files have a few basic requirements:

3. Bitstream Layout

3.1. Audio Streams

Each stem file may contain an arbitrary number of logical bitstreams containing audio, but MUST include at least three such streams (the original post-mix audio and at least two stems). Every audio stream MUST be encoded using the same codec with the same parameters, including bitrate, channel count, channel layout, and sample rate.

The first logical bitstream MUST be the final post-mix, mastered audio. This helps preserve backwards compatibility with media players that do not support a [Skeleton] bitstream. The remaining logical bitstreams are the stems and MUST have the same audio length as the first logical bitstream. For example, if the first logical bitstream is 3 minutes long and the stem file includes a percussion track, but the percussion does not start until one minute into the track, the percussion stem would still be 3 minutes long and would contain a minute of silence at its start.
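The constraints in this section can be checked mechanically once the audio streams have been parsed. The following Python sketch is illustrative only: the StreamInfo type and the validate_audio_streams function are hypothetical names that do not come from this document or from any Ogg library, and the sketch assumes that some demuxer has already produced one summary record per logical audio bitstream, in file order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamInfo:
    """Hypothetical summary of one logical audio bitstream."""
    codec: str
    sample_rate: int       # Hz
    channels: int
    channel_layout: str
    bitrate: int           # bits per second, 0 if not applicable
    num_samples: int       # samples per channel


def validate_audio_streams(streams: list[StreamInfo]) -> list[str]:
    """Return a list of problems; an empty list means the layout is valid."""
    problems = []
    if len(streams) < 3:
        problems.append("need at least 3 audio streams (post-mix plus two stems)")
    if not streams:
        return problems

    main = streams[0]  # the first stream is the post-mix audio
    main_params = (main.codec, main.sample_rate, main.channels,
                   main.channel_layout, main.bitrate)
    for i, s in enumerate(streams[1:], start=1):
        params = (s.codec, s.sample_rate, s.channels,
                  s.channel_layout, s.bitrate)
        if params != main_params:
            problems.append(f"stream {i}: codec parameters differ from stream 0")
        if s.num_samples != main.num_samples:
            problems.append(f"stream {i}: length differs from stream 0")
    return problems
```

A stem that starts late therefore has to be padded with leading silence before encoding so that its sample count matches the first stream.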

3.2. Skeleton

Stem files MUST contain a [Skeleton] bitstream. For each fisbone secondary header packet describing a stem logical bitstream (i.e., not the fisbone packet describing the first stream containing the post-mix audio), the following message headers are defined:

Table 1

Message Header  Requirement Level  Description
--------------  -----------------  -----------
Role            REQUIRED           MUST always be "audio/stem"
Title           REQUIRED           Free text, used for the stem name (e.g. "Percussion")
Stem-color      OPTIONAL           Color representing this track in RGB hex format, e.g. "#145374"

The fisbone secondary header packet describing the first logical bitstream containing the main audio MUST set the "Role" message header to "audio/main".
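As an illustration, the message headers for a stem could be assembled as shown in the Python sketch below. It assumes that Skeleton message header fields are serialized as "Name: value" lines terminated by CRLF, and it builds only the header text, not the binary fisbone packet around it. The function name stem_message_headers and the color pattern are hypothetical choices made for this example.

```python
import re
from typing import Optional

# Assumed pattern for the "RGB hex format" used by Stem-color.
_COLOR_RE = re.compile(r"#[0-9A-Fa-f]{6}")


def stem_message_headers(title: str, color: Optional[str] = None) -> bytes:
    """Build the message header fields for a fisbone describing a stem."""
    if not title or "\r" in title or "\n" in title:
        raise ValueError("Title must be non-empty, single-line text")

    fields = [("Role", "audio/stem"), ("Title", title)]
    if color is not None:
        if not _COLOR_RE.fullmatch(color):
            raise ValueError('Stem-color must look like "#145374"')
        fields.append(("Stem-color", color))

    # Assumption: one "Name: value" field per line, each ending in CRLF.
    return "".join(f"{name}: {value}\r\n" for name, value in fields).encode("utf-8")
```

The fisbone for the first logical bitstream would carry "Role: audio/main" instead and is not covered by this helper.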

4. Mixing

Stem tracks SHOULD NOT have any gain normalization applied. Instead, they should retain the same levels they have in the final mix present in the first track, so that playing all stems together at unity gain yields levels equivalent to the final mix.
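Playing all stems at unity gain amounts to summing them sample by sample with no per-stem scaling. The Python sketch below shows this for a single channel of decoded floating-point samples; the function name mix_at_unity is hypothetical, and decoding, channel handling, and any mastering applied afterwards are out of scope.

```python
def mix_at_unity(stems: list[list[float]]) -> list[float]:
    """Sum equal-length stems sample by sample, applying no gain."""
    if not stems:
        return []
    length = len(stems[0])
    if any(len(stem) != length for stem in stems):
        raise ValueError("all stems must have the same length")
    # zip(*stems) yields one tuple per sample position across all stems.
    return [sum(samples) for samples in zip(*stems)]
```

For instance, mix_at_unity([[0.25, -0.5], [0.25, 0.25]]) returns [0.5, -0.25].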

5. Mastering

Because mastering happens post-mix and the stems are pre-mix audio, the stem tracks SHOULD NOT have any mastering steps applied. Instead, metadata for configuring a compressor and limiter SHOULD be included in the stem file. After mixing the stems, applications MAY feed the mix through a Digital Signal Processor configured with the limiter and compressor settings read from the metadata.

5.1. Compressor Metadata

Metadata used for configuring the compressor should be stored alongside the stem file's global metadata (i.e., in the primary VorbisComment).

Table 2

Tag                          Requirement Level  Values
---------------------------  -----------------  ------
STEM:COMPRESSOR:ENABLED      REQUIRED           "TRUE" or "FALSE"
STEM:COMPRESSOR:RATIO        OPTIONAL           TODO
STEM:COMPRESSOR:OUTPUT_GAIN  OPTIONAL           TODO
STEM:COMPRESSOR:THRESHOLD    OPTIONAL           TODO
STEM:COMPRESSOR:ATTACK       OPTIONAL           TODO
STEM:COMPRESSOR:INPUT_GAIN   OPTIONAL           TODO
STEM:COMPRESSOR:RELEASE      OPTIONAL           TODO
STEM:COMPRESSOR:HP_CUTOFF    OPTIONAL           TODO
STEM:COMPRESSOR:HP_DRY_WET   OPTIONAL           TODO

5.2. Limiter Metadata

Metadata used for configuring the limiter should be stored alongside the stem file's global metadata (i.e., in the primary VorbisComment).

Table 3

Tag                     Requirement Level  Values
----------------------  -----------------  ------
STEM:LIMITER:ENABLED    REQUIRED           "TRUE" or "FALSE"
STEM:LIMITER:RELEASE    OPTIONAL           TODO
STEM:LIMITER:THRESHOLD  OPTIONAL           TODO
STEM:LIMITER:CEILING    OPTIONAL           TODO
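A reader might look up the two ENABLED tags as sketched in Python below. The sketch assumes that the VorbisComment has already been decoded into a list of "NAME=value" strings by some Ogg library, and it compares field names case-insensitively, as is usual for Vorbis comments. The function name processor_enabled is hypothetical, and the remaining tags are not handled because their value formats are not yet defined.

```python
from typing import Optional


def processor_enabled(comments: list[str], processor: str) -> Optional[bool]:
    """Look up STEM:<processor>:ENABLED, e.g. for "COMPRESSOR" or "LIMITER".

    Returns True or False when the tag is present, or None when it is missing.
    """
    wanted = f"STEM:{processor}:ENABLED"
    for comment in comments:
        name, sep, value = comment.partition("=")
        if not sep or name.upper() != wanted:
            continue
        if value == "TRUE":
            return True
        if value == "FALSE":
            return False
        raise ValueError(f'{wanted} must be "TRUE" or "FALSE", got {value!r}')
    return None
```

Returning None for a missing tag lets the caller decide how to treat a file that omits a REQUIRED tag, for example by skipping the mastering step entirely.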

6. IANA Considerations

This memo includes no request to IANA.

7. Security Considerations

This document should not affect the security of the Internet.

8. References

8.1. Normative References

[RFC3533]
Pfeiffer, S., "The Ogg Encapsulation Format Version 0", RFC 3533, DOI 10.17487/RFC3533, May 2003, <https://www.rfc-editor.org/info/rfc3533>.
[RFC5334]
Goncalves, I., Pfeiffer, S., and C. Montgomery, "Ogg Media Types", RFC 5334, DOI 10.17487/RFC5334, September 2008, <https://www.rfc-editor.org/info/rfc5334>.
[Skeleton]
Xiph.Org Foundation, "OGG Skeleton 4", <https://wiki.xiph.org/Ogg_Skeleton_4>.

8.2. Informative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

Author's Address

Sam Whited (editor)