
RtpPusher C++ library
v2.0.0
Table of contents
- Overview
- Versions
- Library files
- Internal architecture
- Codec packetization
- User-data transmission (RTP modes)
- STANAG 4609 streaming (MPEG-TS modes)
- RtpPusher class description
- Build and connect to your project
- Preparing test input files
- Example
- How to play the stream
- How to extract user data from the stream
Overview
RtpPusher is a C++ library that delivers H.264 (RFC 6184), H.265 (RFC 7798) and JPEG (RFC 2435) video over UDP, in three transport modes selectable per send() call:
- rtp (default): codec-specific RTP framing (RFC 6184 / RFC 7798 / RFC 2435).
- mpegts: MPEG-2 Transport Stream over UDP per STANAG 4609 / MISB ST 1402, 7 × 188-byte TS packets per UDP datagram, with optional KLV metadata (MISB ST 0601) on its own PID.
- mpegts-rtp: MPEG-2 TS encapsulated in RTP per MISB ST 1403 (RFC 2250, payload type 33), 7 TS packets per RTP packet.
The library parses each input frame, splits it into RFC-conformant packets and emits them on a single shared UDP socket via a single internal pacing thread. The application calls only send(); everything else — Annex B start-code parsing, FU-A / FU fragmentation, JPEG segment extraction, MPEG-TS muxing (PAT/PMT/PCR/PES), KLV Tag 2 / Tag 1 auto-refresh, RTP header construction, sequence numbers, the 90 kHz timestamp clock, HEVC parameter-set replay on every IDR, drop-oldest-frame back-pressure — is handled internally. The library has no third-party runtime dependencies. The example application bundles the VSourceFile (source code included) library to read H.264 / H.265 / JPEG frames from files. Note: B-frames are not supported — input streams must contain I-frames and P-frames only.
Versions
Table 1 - Library versions.
| Version | Release date | What’s new |
|---|---|---|
| 1.0.0 | 15.07.2024 | - First version. |
| 1.1.0 | 12.12.2024 | - Fix bugs for Windows OS. - Update code structure. - Update example. |
| 1.1.1 | 03.04.2025 | - VSourceFile submodule update for example application. |
| 1.1.2 | 22.06.2025 | - UdpSocket update. |
| 1.2.0 | 25.07.2025 | - Add H.265 support. |
| 1.3.0 | 03.10.2025 | - Fix H.264 streaming issues. - Add user data streaming support. |
| 1.4.0 | 17.10.2025 | - Add JPEG support. - Fix user data insertion to H.265. |
| 1.4.1 | 24.10.2025 | - Fix documentation mistake about the SDP file for H.265. |
| 1.4.2 | 12.11.2025 | - Fix RTP packet calculation. |
| 1.5.0 | 27.12.2025 | - Add Muxed mode for user data transmission. |
| 1.6.0 | 16.01.2026 | - Added input frame buffering. |
| 1.6.1 | 02.02.2026 | - Reduced amount of allocated memory. |
| 1.6.2 | 03.02.2026 | - Fix memory allocation. |
| 1.6.3 | 06.02.2026 | - Added network overload control mechanism based on output buffer size monitoring. |
| 1.6.4 | 24.02.2026 | - Fixed output UDP buffer usage control algorithm to prevent network overflow. |
| 1.7.0 | 27.02.2026 | - Added FPS configuration for RTP timestamp calculation. |
| 1.8.0 | 18.03.2026 | - Removed deprecated bandwidthKbps init parameter. - Removed legacy time-slot pacing code path. - Updated API documentation and examples for the current back-pressure flow. |
| 1.8.1 | 13.04.2026 | - Fixed internal synchronization bugs. |
| 2.0.0 | 28.04.2026 | - API redesigned. - Added new mechanism for bitrate control to prevent network overload. - Added STANAG 4609 video stream standard support. |
Library files
The library is supplied as source code only. The user will be given a set of files in the form of a CMake project (repository). The repository structure is shown below:
CMakeLists.txt ------------- Main CMake file of the library.
src ------------------------ Folder with library source code.
    CMakeLists.txt --------- CMake file of the library.
    RtpPusher.h ------------ Main library header file.
    RtpPusher.cpp ---------- C++ implementation file.
    RtpPusherVersion.h ----- Header file with library version.
    RtpPusherVersion.h.in -- CMake service file to generate version header.
example -------------------- Folder with test application files.
    CMakeLists.txt --------- CMake file for example application.
    main.cpp --------------- Source C++ file of example application.
    3rdparty --------------- Folder with third-party libraries.
        CMakeLists.txt ----- CMake file to include third-party libraries.
        VSourceFile -------- Folder with VSourceFile library source code.
Internal architecture
┌─────────────────────┐ ┌──────────────────────────────┐ ┌──────────────────────┐
│ Application │ │ Packet ring buffer │ │ Pacing thread │
│ │ │ (2048 PacedPacket slots) │ │ │
│ send(data .., ip, │──────>│ [P1][P2][P3][P4]... │──────>│ sendto() over a │
│ port, fps, │ │ each slot carries dst+ts │ │ single shared UDP │
│ bitrate, ...) │ │ │ │ socket │
│ (non-blocking) │ │ drop-oldest-frame on full │ │ │
└─────────────────────┘ └──────────────────────────────┘ └──────────────────────┘
│ │ │
parses frame, per-destination state Mode A: bitrate pacing
builds RTP packets, (seq, ts counter, HEVC PS, Mode B: kernel SO_SNDBUF
one timestamp per frame SSRC, pacing config) back-pressure
Lifecycle
The class has no explicit constructor — a freshly-constructed object holds no system resources (no socket, no thread, no buffer). Resources are created lazily and torn down on stop():
| State | After… | Allocated resources |
|---|---|---|
| Uninitialised | construction | none |
| Uninitialised | stop() | none |
| Initialised | first send() after construction | UDP socket, ring buffer, pacing thread |
| Initialised | send() after stop() | new UDP socket, new buffer, new thread |
The stop() method is safe to call on an already-stopped object, and the destructor calls stop() automatically. The cumulative drop counter that send() consults to flag SEND_FRAME_DROP is preserved across stop()/send() cycles. send() and stop() must not be called concurrently; the typical pattern is a single producer thread that also calls stop() when shutting down.
Packet ring buffer and pacing thread
send() builds the RTP or MPEG-TS packets for a frame and copies each one into a fixed-size ring buffer (2048 slots, ~4 MB total) along with its destination address and the per-stream targetBitrateKbps. The pacing thread pulls packets out in batches of up to 16, releases the queue mutex, sends the batch via sendto(), then sleeps until the next deadline. This decouples frame parsing from network transmission: send() returns as soon as packets are enqueued, even on slow networks where the actual transmission may take >100 ms.
Per-destination state
The library maintains one StreamState per (ip, port) destination, keyed by ((ip << 16) | port). Each state holds:
- Video and metadata destination addresses, transmission mode for user-data.
- Next RTP video sequence number and next user-data sequence number.
- A 64-bit frame counter that drives the 90 kHz timestamp clock.
- Video and metadata SSRCs.
- Stream fps and targetBitrateKbps (refreshed on every send()).
- Cached HEVC VPS, SPS and PPS NAL units (replayed before every IDR/CRA/BLA).
Because the state is per-destination, one RtpPusher instance can stream concurrently to several receivers, each with its own monotonic sequence and timestamp.
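As a rough illustration of the keying scheme, the sketch below packs a 32-bit IPv4 address and a 16-bit port into one 64-bit map key. The names makeDestinationKey, stateFor and StreamStateSketch are hypothetical, and holding the address as a host-order 32-bit integer is an assumption; the library's real StreamState layout is not shown in this document.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical, abridged stand-in for the per-destination state.
struct StreamStateSketch {
    uint16_t nextVideoSeq = 0;  // next RTP video sequence number
    uint64_t frameCount   = 0;  // drives the 90 kHz timestamp clock
};

// Pack (ip, port) into one key: address in bits 16..47, port in bits 0..15.
// The widening cast matters: shifting a 32-bit value left by 16 would
// discard the top 16 bits of the address.
inline uint64_t makeDestinationKey(uint32_t ipv4, uint16_t port) {
    return (static_cast<uint64_t>(ipv4) << 16) | port;
}

// Look up the state for one destination, creating it on first use.
inline StreamStateSketch& stateFor(
        std::unordered_map<uint64_t, StreamStateSketch>& table,
        uint32_t ipv4, uint16_t port) {
    return table[makeDestinationKey(ipv4, port)];
}
```

Two receivers on the same host but different ports map to different keys, so each keeps its own sequence numbers and frame counter.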
Pacing modes
The pacing mode is selected per-frame by the targetBitrateKbps argument to send():
Mode A: token-bucket pacing (targetBitrateKbps > 0). After emitting a batch of N bytes, the thread advances an absolute deadline by N * 8 / targetBitrateKbps milliseconds (bits divided by kbit/s yields milliseconds) and calls sleep_until on it. Using an absolute deadline rather than a relative sleep_for means a late wake-up does not push subsequent frames further out; the next batch catches up automatically.
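The absolute-deadline idea can be sketched as follows. This is a minimal illustration rather than the library's pacing loop: batchDuration and paceBatch are hypothetical names, and the sketch assumes kbps is strictly positive.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>

using Clock = std::chrono::steady_clock;

// Time needed to put `bytes` on the wire at `kbps` kilobits per second.
// bits / (kbit/s) is a number of milliseconds, so scale by 1e6 for ns.
// Assumes kbps > 0.
inline Clock::duration batchDuration(std::size_t bytes, int kbps) {
    const std::int64_t bits = static_cast<std::int64_t>(bytes) * 8;
    const std::chrono::nanoseconds ns(bits * 1000000 / kbps);
    return std::chrono::duration_cast<Clock::duration>(ns);
}

// Advance the absolute deadline by the batch duration and sleep until it.
// After a late wake-up the new deadline is closer (or already past), so
// sleep_until returns sooner and the schedule catches up on its own.
inline Clock::time_point paceBatch(Clock::time_point deadline,
                                   std::size_t batchBytes, int kbps) {
    deadline += batchDuration(batchBytes, kbps);
    std::this_thread::sleep_until(deadline);
    return deadline;
}
```

For instance, a 1250-byte batch at 5000 kbit/s is 10000 bits and moves the deadline forward by 2 ms.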
Mode B — kernel buffer back-pressure (targetBitrateKbps <= 0). Every 16 sent packets the thread samples the kernel UDP send-buffer occupancy via ioctl(SIOCOUTQ). While occupancy is above 80 %, it sleeps with exponential back-off (500 µs → 5 ms, ≤ 15 retries) so the kernel has time to drain. This is the right mode when the network capacity is unknown or fluctuates and you only want to avoid overrunning the kernel queue.
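The back-off schedule might look like the sketch below. The document only gives the end points (500 µs and 5 ms) and the retry limit, so doubling the delay on each retry is an assumption, and backoffDelay and aboveThreshold are hypothetical helper names. On Linux the queuedBytes argument would come from ioctl(SIOCOUTQ).

```cpp
#include <algorithm>
#include <chrono>

// Delay before retry number `retry` (0-based).
// Assumption: the delay doubles each time and is capped at 5 ms.
inline std::chrono::microseconds backoffDelay(int retry) {
    const long long base = 500;   // first delay, microseconds
    const long long cap  = 5000;  // upper bound, microseconds
    long long delay = base;
    for (int i = 0; i < retry && delay < cap; ++i) {
        delay *= 2;
    }
    return std::chrono::microseconds(std::min(delay, cap));
}

// True while the kernel send queue holds more than 80 % of SO_SNDBUF.
inline bool aboveThreshold(int queuedBytes, int sndBufBytes) {
    return static_cast<long long>(queuedBytes) * 100 >
           static_cast<long long>(sndBufBytes) * 80;
}
```

A pacing loop would sleep for backoffDelay(retry) while aboveThreshold(...) holds and give up after 15 retries.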
Platform note — Mode B on Windows.
SIOCOUTQ is a Linux-only ioctl; Windows has no public API to read the current occupancy of a UDP socket’s send buffer. On Windows the pacing thread therefore does not enforce the 80 % threshold: the explicit “sleep while buffer is above 80 %” loop becomes a no-op. Throttling still happens implicitly: sendto() on a default (blocking) Windows UDP socket blocks the pacing thread when SO_SNDBUF fills up, which in practice yields a similar end-to-end effect (the producer eventually backs off via the ring-buffer drop policy), but without the smooth exponential back-off and without a configurable threshold. If you need precise rate control on Windows, prefer Mode A (targetBitrateKbps > 0).
Drop policy
The ring buffer holds 2048 RTP packets. When send() cannot fit a new frame, it drops the oldest queued frame as a unit — every packet sharing the oldest frameId in the queue is removed. This guarantees the queue never holds a partial frame, which would be undecodable. Two layers handle overflow:
- Proactive drop in send(): estimates how many RTP packets the new frame will produce (frame.size / maxPayloadSize + 16, where the 16 packets of slack cover header overhead, HEVC parameter-set replay, and SEI), and drops oldest frames until it fits.
- Per-packet safety net in enqueuePacedPacket(): if the proactive estimate undercounts (very small maxPayloadSize, many tiny NALs), the per-packet enqueue path drops one more oldest frame to make room.
When any drop occurs during a send() call, send() returns SEND_FRAME_DROP (-2) instead of SEND_OK. The current frame still made it into the ring — the receiver simply sees a gap where the evicted older frames used to be. This indicates the producer is outrunning the network (encoder rate above pacing rate, or pacer disabled while the kernel buffer is saturated).
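The drop-oldest-frame policy can be sketched with a plain deque standing in for the ring buffer. QueuedPacket, dropOldestFrame and makeRoom are hypothetical names; the sketch assumes packets of one frame sit contiguously in the queue and that the queue never exceeds its capacity.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical, abridged queue slot: only the frame id matters here.
struct QueuedPacket {
    uint64_t frameId;
};

// Proactive estimate: payload-sized chunks plus 16 packets of slack.
inline std::size_t estimatePackets(std::size_t frameSize,
                                   std::size_t maxPayloadSize) {
    return frameSize / maxPayloadSize + 16;
}

// Remove every packet sharing the oldest frameId; returns packets removed.
inline std::size_t dropOldestFrame(std::deque<QueuedPacket>& q) {
    if (q.empty()) return 0;
    const uint64_t oldest = q.front().frameId;
    std::size_t removed = 0;
    while (!q.empty() && q.front().frameId == oldest) {
        q.pop_front();
        ++removed;
    }
    return removed;
}

// Evict whole frames until `needed` slots are free.
// Returns true if at least one frame was evicted (the SEND_FRAME_DROP case).
inline bool makeRoom(std::deque<QueuedPacket>& q, std::size_t capacity,
                     std::size_t needed) {
    bool dropped = false;
    while (!q.empty() && capacity - q.size() < needed) {
        dropOldestFrame(q);
        dropped = true;
    }
    return dropped;
}
```

Because eviction always removes all packets of the front frame, the queue never ends up holding part of a frame.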
Stream start gating (H.264 / H.265)
Receivers cannot decode P-slices that reference frames they have never seen. If a receiver joins or starts mid-stream and the very first packet it gets is a P-slice, the decoder produces errors or black frames until the next IDR arrives.
To avoid emitting undecodable packets, send() enforces a wait-for-first-IDR rule on every H.264 / H.265 destination:
- The very first frame send() accepts for a destination must contain an intra-coded NAL:
  - H.264: nal_unit_type == 5 (IDR slice).
  - H.265: nal_unit_type in 16..21 (BLA / IDR / CRA, i.e. any IRAP).
- All frames before the first intra are silently dropped: send() returns SEND_OK, but no UDP packets are produced, no RTP timestamp counter advances, and no user-data is sent for them.
- Once the first intra has been observed, all subsequent frames (including P-slices) flow through normally.
- The state is per-destination: different (ip, port) destinations each track their own first-IDR independently.
- stop() clears the per-destination state, so a stop() → send() restart waits for a fresh first IDR.
JPEG is exempt: every JPEG frame is independently decodable, so there is no gating for the JPEG codec.
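The gate itself reduces to two NAL-type tests and one latch. The sketch below is illustrative only; isH264Idr, isH265Irap and StartGate are hypothetical names, and a real implementation would scan every NAL of the frame before deciding.

```cpp
#include <cstdint>

// H.264: nal_unit_type is the low 5 bits of the 1-byte NAL header.
inline bool isH264Idr(uint8_t nalHeader) {
    return (nalHeader & 0x1F) == 5;
}

// H.265: nal_unit_type is bits 1..6 of the first NAL header byte.
inline bool isH265Irap(uint8_t nalHeaderByte0) {
    const int type = (nalHeaderByte0 & 0x7E) >> 1;
    return type >= 16 && type <= 21;
}

// One gate per destination: closed until the first intra frame arrives,
// then open for the rest of the stream.
struct StartGate {
    bool open = false;

    // Returns true when the frame may be packetised.
    bool admit(bool frameHasIntra) {
        if (frameHasIntra) open = true;
        return open;
    }
};
```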
Practical implication: an encoder driving the pusher should be configured to start with, and ideally periodically emit, an IDR (e.g. -g <gop_size> -keyint_min 1 when encoding with libx264 through ffmpeg, or --keyint <gop_size> with the x265 command-line encoder). If the encoder is allowed to “warm up” with P-slices before its first IDR, those frames are silently discarded by the pusher and the receiver only sees video starting from the first IDR. To make a stream resumable / late-join-friendly without relying on this gating, configure the encoder to emit IDRs frequently (e.g. every 1–2 seconds for live streaming).
RTP timestamp clock
The 90 kHz RTP timestamp is recomputed for every frame from the per-stream frame counter:
timestamp = round(frameCount * 90000.0 / fps)
This avoids the cumulative drift that an integer-truncated per-frame increment ((int)(90000 / fps)) would cause for non-integer fps such as 23.976 or 29.97. Worst-case error is bounded by ±1 tick (≈ 11 µs) regardless of stream length.
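A small sketch makes the drift argument concrete. rtpTimestamp follows the formula above and truncatedTimestamp models the integer-increment alternative; both names are hypothetical, and wrapping the result to 32 bits is an assumption based on the width of the RTP timestamp field.

```cpp
#include <cmath>
#include <cstdint>

// Timestamp recomputed from the frame counter, as in the formula above.
// The result is reduced modulo 2^32 to fit the RTP timestamp field.
inline uint32_t rtpTimestamp(uint64_t frameCount, double fps) {
    const double ticks =
        std::round(static_cast<double>(frameCount) * 90000.0 / fps);
    return static_cast<uint32_t>(static_cast<uint64_t>(ticks));
}

// The alternative the text warns about: a truncated per-frame increment
// accumulated over the whole stream.
inline uint64_t truncatedTimestamp(uint64_t frameCount, double fps) {
    return frameCount * static_cast<uint64_t>(90000.0 / fps);
}
```

At 29.97 fps, frame 2997 falls exactly 100 s into the stream, i.e. 9 000 000 ticks. The rounded formula returns that value, while the truncated increment of 3003 ticks per frame gives 8 999 991, already 9 ticks behind.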
Codec packetization
This section explains what the library does to each input frame between the send() call and the packets reaching the wire.
H.264 packetization (RFC 6184)
Input: raw H.264 Annex B bitstream — NAL units delimited by start codes (0x000001 or 0x00000001).
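For orientation, a start-code scan of this kind can be sketched as below. It is a simplified illustration, not the library's parser: NalRef and splitAnnexB are hypothetical names, and trailing zero bytes before a start code are treated as part of the delimiter.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct NalRef {
    const uint8_t* data;  // first byte of the NAL (its header byte)
    std::size_t size;     // bytes up to the next start code or buffer end
};

// Split an Annex B buffer on 0x000001 / 0x00000001 start codes.
inline std::vector<NalRef> splitAnnexB(const uint8_t* buf, std::size_t len) {
    std::vector<NalRef> nals;
    std::size_t nalStart = len;  // len means "no NAL open yet"
    std::size_t i = 0;
    while (i + 3 <= len) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            if (nalStart != len) {
                std::size_t end = i;
                // Zeros right before the code belong to a 4-byte start
                // code (or trailing padding), not to the previous NAL.
                while (end > nalStart && buf[end - 1] == 0) --end;
                nals.push_back({buf + nalStart, end - nalStart});
            }
            nalStart = i + 3;
            i += 3;
        } else {
            ++i;
        }
    }
    if (nalStart < len) nals.push_back({buf + nalStart, len - nalStart});
    return nals;
}
```

For H.264 the type of each NAL is then data[0] & 0x1F.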
Step 0 — wait for first IDR. If this is the first frame for the destination and it does not contain an IDR slice (nal_unit_type == 5), the frame is silently dropped. See Stream start gating.
Step 1 — split into NAL units. The library scans the byte stream for start codes and produces a sequence of (pointer, size, header_byte) tuples. The 1-byte NAL header carries forbidden_zero_bit | nal_ref_idc | nal_unit_type; the type is the low 5 bits.
Step 2 — locate the last VCL NAL. VCL types are 1–5. Only the final RTP packet of the last VCL NAL gets the RTP marker bit set; non-VCL NALs (SPS=7, PPS=8, SEI=6, …) never carry the marker.
Step 3 — packetise each NAL with payload type 96:
- Single-NAL packet if nalSize + 12 <= maxPayloadSize: the NAL is sent as-is in the RTP payload (RFC 6184 §5.6).
- FU-A fragmentation (RFC 6184 §5.8) otherwise. Two-byte FU-A header per fragment:
  - byte 0 (FU indicator): F=0 | NRI (2 bits copied from the NAL header) | Type=28
  - byte 1 (FU header): S | E | R | NAL_type, where S=1 on the first fragment and E=1 on the last
  - The original NAL header byte itself is not carried in the payload; the receiver reconstructs it from the FU indicator + FU header.
Step 4 — set the marker bit on the final fragment of the last VCL NAL of the frame.
RTP timestamp is the same for every packet of one frame and is computed as described in RTP timestamp clock.
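The FU-A layout from Step 3 can be sketched as a function that turns one oversized NAL into a list of RTP payloads (without the 12-byte RTP header). fragmentFuA is a hypothetical name, and maxFragment stands for the room left for NAL body bytes after the two FU-A bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Split one NAL (header byte + body) into FU-A payloads.
// Intended for NALs that do not fit a single packet; a NAL that fits
// would get S and E on the same fragment, which RFC 6184 does not allow.
inline std::vector<std::vector<uint8_t>> fragmentFuA(const uint8_t* nal,
                                                     std::size_t nalSize,
                                                     std::size_t maxFragment) {
    std::vector<std::vector<uint8_t>> out;
    if (nalSize < 2 || maxFragment == 0) return out;

    const uint8_t header = nal[0];
    const uint8_t indicator =
        static_cast<uint8_t>((header & 0x60) | 28);  // F=0 | NRI | Type=28
    const uint8_t nalType = header & 0x1F;

    std::size_t pos = 1;  // the original NAL header byte is not copied
    while (pos < nalSize) {
        const std::size_t chunk = std::min(maxFragment, nalSize - pos);
        uint8_t fuHeader = nalType;                    // R stays 0
        if (pos == 1) fuHeader |= 0x80;                // S: first fragment
        if (pos + chunk == nalSize) fuHeader |= 0x40;  // E: last fragment

        std::vector<uint8_t> payload;
        payload.reserve(chunk + 2);
        payload.push_back(indicator);
        payload.push_back(fuHeader);
        payload.insert(payload.end(), nal + pos, nal + pos + chunk);
        out.push_back(std::move(payload));
        pos += chunk;
    }
    return out;
}
```

The receiver rebuilds the NAL header as (indicator & 0xE0) | (fuHeader & 0x1F) and appends the fragment bodies in order.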
H.265 packetization (RFC 7798)
Input: raw H.265 Annex B bitstream. Same start-code framing as H.264.
Step 0 — wait for first IRAP. If this is the first frame for the destination and it does not contain an IRAP slice (nal_unit_type in 16..21: BLA / IDR / CRA), the frame is silently dropped. See Stream start gating.
Step 1 — split into NAL units. The 2-byte NAL header carries F | nal_unit_type(6) | nuh_layer_id(6) | nuh_temporal_id_plus1(3). The library extracts nal_unit_type = (byte0 & 0x7E) >> 1.
Step 2 — cache parameter sets. Whenever a VPS (type 32), SPS (type 33), or PPS (type 34) is observed, its bytes are stored per destination (not globally), so two receivers driven from one RtpPusher cannot overwrite each other’s caches.
Step 3 — replay parameter sets before every intra NAL. Intra types are 16–21 (IDR_W_RADL, IDR_N_LP, CRA_NUT, BLA_W_*). When such a NAL is about to be packetised, the cached VPS, SPS and PPS are emitted first as separate RTP packets. This lets late-join receivers decode without waiting for the next periodic parameter-set refresh from the encoder.
Step 4 — locate the last VCL NAL (types 0–31) for the marker-bit decision.
Step 5 — packetise each NAL with payload type 96:
- Single-NAL packet if nalSize + 12 <= maxPayloadSize (RFC 7798 §4.4.1).
- FU fragmentation (RFC 7798 §4.4.3) otherwise. Three-byte FU header per fragment:
  - byte 0 (PayloadHdr-0): original NAL byte 0 with F and the high LayerId bit preserved (& 0x81) and Type rewritten to 49.
  - byte 1 (PayloadHdr-1): original NAL byte 1 verbatim (low LayerId + TID).
  - byte 2 (FU header): S | E | original_nal_type (6 bits), with S=1 on the first fragment and E=1 on the last. Unlike H.264 FU-A, the HEVC FU header has no reserved R bit.
Step 6 — marker bit on the final fragment of the last VCL NAL of the frame.
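The three FU bytes from Step 5 can be derived from the original 2-byte NAL header as in this sketch (makeHevcFuHeader is a hypothetical helper name). Per RFC 7798 the FU header byte is S, E and a 6-bit FuType, with no reserved bit.

```cpp
#include <array>
#include <cstdint>

// Build PayloadHdr (2 bytes) + FU header (1 byte) for one HEVC fragment.
inline std::array<uint8_t, 3> makeHevcFuHeader(uint8_t nalByte0,
                                               uint8_t nalByte1,
                                               bool first, bool last) {
    const uint8_t nalType = static_cast<uint8_t>((nalByte0 & 0x7E) >> 1);
    std::array<uint8_t, 3> h{};
    // Keep F and the high LayerId bit, rewrite the type field to 49 (FU).
    h[0] = static_cast<uint8_t>((nalByte0 & 0x81) | (49 << 1));
    // Low LayerId bits and TID are copied verbatim.
    h[1] = nalByte1;
    // S | E | FuType (the original 6-bit nal_unit_type).
    h[2] = static_cast<uint8_t>((first ? 0x80 : 0x00) |
                                (last ? 0x40 : 0x00) | nalType);
    return h;
}
```

For an IDR_W_RADL NAL with header bytes 0x26 0x01, the first fragment starts with 0x62 0x01 0x93.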
JPEG packetization (RFC 2435)
Input: full JFIF JPEG frame (markers + entropy-coded scan data).
Step 1 — read DQT (quantisation tables). The library walks the byte stream looking for 0xFFDB markers and concatenates every quantisation table into a flat byte array (each Pq/Tq selector byte stripped, only the 64- or 128-byte table data kept). If the encoder emitted a single shared table (one of ffmpeg’s default behaviours when luma and chroma quantisation matrices are identical), the table is duplicated to satisfy strict RFC 2435 receivers that expect 128 bytes (luma + chroma) for Type 0/1.
Step 2 — read SOF (Start Of Frame). Width, height, and chroma subsampling are extracted from the SOF0 / SOF2 segment and converted to RFC 2435’s Type byte: 0 for 4:2:2, 1 for 4:2:0, 2 for 4:4:4 (anything else falls back to 0).
Step 3: read DRI (Define Restart Interval). If a DRI segment is present, 64 is added to Type (type += 64, i.e. bit 6 / 0x40 is set; RFC 2435 assigns types 64–127 to streams with restart markers), signalling to the receiver that the packets carry a 4-byte Restart Marker header.
Step 4 — locate the entropy-coded scan data. The library scans past the SOS marker and reads forward until EOI (0xFFD9), correctly handling byte-stuffing (0xFF00 ↔ literal 0xFF) and restart markers (0xFFD0..0xFFD7, kept inline).
Step 5 — packetise the scan data with payload type 26 (RFC 2435 §3.1):
Each packet starts with the 8-byte JPEG RTP header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|TypeSpec | Fragment Offset | 1+3 bytes
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Q | Width | Height| 1+1+1+1 bytes
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- TypeSpec = 0 (always; in RFC 2435 this value marks progressively scanned, i.e. non-interlaced, frames)
- Fragment Offset (24-bit BE): byte index in the scan data of the first byte carried by this packet
- Type: chroma subsampling code (plus 64 when restart markers are present)
- Q = 255: signals “DQT is inlined at the start of the first packet”
- Width / 8, Height / 8: dimensions in 8-pixel units (max 2040×2040)
If the JPEG had a DRI, a 4-byte Restart Marker header follows: RestartInterval(16) | F(1) L(1) RestartCount(14). Because the library does not align packet boundaries to restart-interval boundaries, it always sets F=1, L=1, RestartCount=0x3FFF (the “no info” value).
The first packet of the frame additionally carries an in-line DQT section (MBZ | Precision | Length(BE) | tableData…) right after the header, so the receiver doesn’t need an out-of-band quantisation table.
After all headers, the rest of the packet is filled with raw scan bytes. The RTP marker bit is set on the final packet of the frame.
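The header bytes described in Step 5 can be assembled as in the sketch below. makeJpegRtpHeader is a hypothetical helper; it covers the 8-byte main header and the optional 4-byte Restart Marker header, and leaves out the in-line DQT section of the first packet.

```cpp
#include <cstdint>
#include <vector>

// Build the JPEG payload header that follows the 12-byte RTP header.
inline std::vector<uint8_t> makeJpegRtpHeader(uint32_t fragmentOffset,
                                              uint8_t subsamplingType,
                                              uint16_t widthPx,
                                              uint16_t heightPx,
                                              bool hasDri,
                                              uint16_t restartInterval) {
    std::vector<uint8_t> h;
    h.reserve(12);
    h.push_back(0);  // Type-specific: 0
    h.push_back(static_cast<uint8_t>((fragmentOffset >> 16) & 0xFF));
    h.push_back(static_cast<uint8_t>((fragmentOffset >> 8) & 0xFF));
    h.push_back(static_cast<uint8_t>(fragmentOffset & 0xFF));
    // Type: subsampling code, plus 64 when restart markers are present.
    h.push_back(static_cast<uint8_t>(subsamplingType + (hasDri ? 64 : 0)));
    h.push_back(255);                                // Q: tables in-line
    h.push_back(static_cast<uint8_t>(widthPx / 8));  // 8-pixel units
    h.push_back(static_cast<uint8_t>(heightPx / 8));
    if (hasDri) {
        h.push_back(static_cast<uint8_t>(restartInterval >> 8));
        h.push_back(static_cast<uint8_t>(restartInterval & 0xFF));
        h.push_back(0xFF);  // F=1, L=1, upper 6 bits of RestartCount
        h.push_back(0xFF);  // lower 8 bits of RestartCount (0x3FFF)
    }
    return h;
}
```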
User-data transmission (RTP modes)
In streamType="rtp" the userDataPort argument to send() selects how user-data (e.g. KLV) is delivered alongside the video. The choice is per-destination: the first send() to a new (ip, port) fixes the mode, subsequent calls reuse it.
| userDataPort | Mode | Wire format |
|---|---|---|
| 0 | Embedded SEI NAL | One SEI NAL inside the video stream (H.264 type 6, H.265 type 39 PREFIX_SEI). Payload type user_data_unregistered (5) with a fixed 16-byte UUID prefix. JPEG has no SEI mechanism, so the user-data is silently dropped. |
| == port | Muxed in video | A separate RTP packet on the same UDP port, payload type 98, distinct SSRC (12346 vs 12345 for video). Receivers demux by SSRC/PT. |
| != port | Separate port | A separate RTP packet on a separate UDP port, payload type 98. |
Constraints:
- userDataSize + 12 <= maxPayloadSize (the user-data must fit in one RTP packet for the muxed and separate-port modes).
- For embedded SEI, the same constraint applies after the SEI overhead is added (UUID 16 B, payloadType 1 B, payloadSize 1+ B, rbsp_trailing_bits 1 B).
- The user-data RTP packet carries the same RTP timestamp as the video frame, so receivers can synchronise it with the picture.
In MPEG-TS modes (mpegts / mpegts-rtp) the userDataPort argument is ignored: KLV always travels on its own PID inside the same UDP port (per MISB ST 1402). See STANAG 4609 streaming for details.
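The RTP-mode selection rule from the table can be written as a small decision function. UserDataMode, selectUserDataMode and fitsInOneRtpPacket are hypothetical names used only for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

enum class UserDataMode { EmbeddedSei, MuxedSamePort, SeparatePort };

// Map the userDataPort argument to a delivery mode (RTP modes only).
inline UserDataMode selectUserDataMode(uint16_t port, uint16_t userDataPort) {
    if (userDataPort == 0) return UserDataMode::EmbeddedSei;
    if (userDataPort == port) return UserDataMode::MuxedSamePort;
    return UserDataMode::SeparatePort;
}

// Muxed and separate-port modes: user-data plus the 12-byte RTP header
// must fit in one packet.
inline bool fitsInOneRtpPacket(std::size_t userDataSize,
                               std::size_t maxPayloadSize) {
    return userDataSize + 12 <= maxPayloadSize;
}
```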
STANAG 4609 streaming (MPEG-TS modes)
STANAG 4609 is the standard for digital motion imagery, with the bitstream layer specified by the MISB family of documents (in particular MISB ST 1402 for raw TS-over-UDP and MISB ST 1403 for TS-over-RTP). RtpPusher implements both transports and the MISB ST 0601 KLV metadata layer.
Stream types and validity matrix
send() selects the transport per call via streamType:
| streamType | Standard | Wire format |
|---|---|---|
| "rtp" (or "") | RFC 6184 / 7798 / 2435 | Codec-specific RTP framing, the original library mode. |
| "mpegts" | MISB ST 1402 | MPEG-2 TS over UDP, 7 × 188 = 1316 bytes per UDP datagram, no RTP. |
| "mpegts-rtp" | MISB ST 1403 | MPEG-2 TS over RTP (RFC 2250, payload type 33), 12 + 7 × 188 = 1328 bytes per UDP datagram. |
Validity — which (codec × streamType) combinations work:
| Codec | rtp | mpegts | mpegts-rtp |
|---|---|---|---|
| H264 | yes | yes | yes |
| H265 | yes | yes | yes |
| JPEG | yes | no | no |
JPEG-in-MPEG-TS is rejected at the API level: STANAG 4609 / MISB ST 1402 do not define a TS stream_type for JPEG. Use streamType="rtp" for JPEG.
The streamType is latched per destination on the first send() to a given (ip, port). Subsequent calls with a different streamType return SEND_MODE_MISMATCH. Call stop() to reset the latch (it also tears down the socket, pacing thread and ring buffer).
MPEG-TS multiplex layout
For both mpegts and mpegts-rtp modes the library produces the same TS multiplex; only the wrapping at the UDP boundary differs. PIDs are fixed:
| PID | Content |
|---|---|
| 0x0000 | PAT (Program Association Table) |
| 0x1000 | PMT (Program Map Table) |
| 0x0100 | Video ES (stream_type 0x1B for H.264, 0x24 for HEVC) |
| 0x0101 | KLV metadata ES (stream_type 0x15) with registration_descriptor "KLVA" |
| 0x1FFF | NULL packets (used to pad partial UDP datagrams to 7 TS) |
PAT/PMT are emitted at least every 100 ms (per common practice / DVB requirements). Continuity counters are tracked separately for each PID. The PMT always declares the KLV PID with the "KLVA" registration descriptor — even on frames where no metadata is supplied, so receivers see a stable PMT version and don’t need to re-parse it when KLV starts/stops appearing.
Each frame produces, in order:
[ PAT, PMT ] ← only when ≥ 100 ms since last PSI
[ Video TS packets, PCR on first ]
[ KLV TS packets ] ← only when userData ≠ nullptr
[ NULL TS padding ] ← to fill the last UDP datagram to 7 TS
All TS packets of one source frame carry the same frameId internally, so the drop policy treats the whole frame (video + KLV) as a unit when the ring buffer is full.
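The NULL-padding step amounts to a remainder calculation plus a fixed 188-byte packet. The sketch below is illustrative; nullPacketsNeeded and makeNullTsPacket are hypothetical names, and filling the payload with 0xFF is a common convention rather than something this document specifies.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTsPacketSize  = 188;
constexpr std::size_t kTsPerDatagram = 7;

// How many NULL packets complete the last datagram to 7 TS packets.
inline std::size_t nullPacketsNeeded(std::size_t tsPacketCount) {
    return (kTsPerDatagram - tsPacketCount % kTsPerDatagram) % kTsPerDatagram;
}

// One NULL TS packet on PID 0x1FFF.
inline std::array<uint8_t, kTsPacketSize> makeNullTsPacket() {
    std::array<uint8_t, kTsPacketSize> p;
    p.fill(0xFF);  // stuffing bytes (assumed fill value)
    p[0] = 0x47;   // sync byte
    p[1] = 0x1F;   // TEI=0, PUSI=0, priority=0, PID high 5 bits
    p[2] = 0xFF;   // PID low 8 bits
    p[3] = 0x10;   // not scrambled, payload only, continuity counter 0
    return p;
}
```

A frame that produced 10 TS packets therefore needs 4 NULL packets, giving two full 1316-byte datagrams.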
PCR generation
The PCR (Program Clock Reference) is computed from a per-stream steady_clock epoch captured on the first send() to a destination, converted to 27 MHz ticks. PCR is inserted in the adaptation field of the first video TS packet of each frame, so:
- The receiver gets at least one PCR per frame (e.g. one every 33 ms at 30 fps), well within the ≤ 100 ms repetition rate the DVB / MPEG-TS specs require.
- Jitter is bounded by the OS scheduler granularity. On Linux this is sub-millisecond by default; on Windows the library calls timeBeginPeriod(1) in initialize() (released in stop()) to drop the scheduler quantum from the default ~15.6 ms to 1 ms.
- The PCR clock is independent of wall-clock time; receivers use it for clock recovery, not absolute time. For absolute time (UTC) use the KLV Tag 2 Precision Time Stamp.
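The conversion from a steady_clock epoch to the 6-byte PCR field can be sketched as follows. pcrTicks and encodePcr are hypothetical names; the field layout (33-bit base at 90 kHz, 6 reserved bits set to 1, 9-bit extension counting 0..299) follows the MPEG-2 Systems adaptation-field format.

```cpp
#include <array>
#include <chrono>
#include <cstdint>

// 27 MHz ticks elapsed between the per-stream epoch and `now`.
inline uint64_t pcrTicks(std::chrono::steady_clock::time_point epoch,
                         std::chrono::steady_clock::time_point now) {
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(now - epoch)
            .count();
    return static_cast<uint64_t>(ns) * 27 / 1000;  // 27 ticks per microsecond
}

// Encode a 27 MHz tick count into the 6-byte PCR field.
inline std::array<uint8_t, 6> encodePcr(uint64_t ticks27MHz) {
    const uint64_t base = (ticks27MHz / 300) & 0x1FFFFFFFFULL;  // 33 bits
    const uint32_t ext  = static_cast<uint32_t>(ticks27MHz % 300);
    std::array<uint8_t, 6> out{};
    out[0] = static_cast<uint8_t>(base >> 25);
    out[1] = static_cast<uint8_t>(base >> 17);
    out[2] = static_cast<uint8_t>(base >> 9);
    out[3] = static_cast<uint8_t>(base >> 1);
    // Last base bit, six reserved 1-bits, top extension bit.
    out[4] = static_cast<uint8_t>(((base & 1) << 7) | 0x7E | ((ext >> 8) & 1));
    out[5] = static_cast<uint8_t>(ext & 0xFF);
    return out;
}
```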
KLV metadata (UAS Datalink Local Set)
The library does not generate KLV — the application is responsible for forming a valid UAS Datalink Local Set (MISB ST 0601 LDS) with whatever tags the operational use case requires (Tag 1 Checksum, Tag 2 Precision Time Stamp, Tag 65 LDS Version Number, plus optional Tags 5/6/7 platform attitude, 13/14/15 sensor lat/lon/alt, 16/17 sensor FOV, 23/24/25 frame-centre geolocation, …).
Pass the KLV bytes via the userData / userDataSize parameters of send(). The library then:
- Wraps the KLV in a metadata PES (stream_id 0xFC) with PTS equal to the video PTS, splits it across TS packets on PID 0x0101, and emits them right after the video TS packets of the current frame.
- Auto-refreshes Tag 2 in place to the current UTC microseconds and recomputes Tag 1 (Checksum) per ST 0601 §6, so the wire bytes are MISB-fresh every frame even if the application only fills the buffer once.
- If the buffer is not a recognisable UAS LDS structure (wrong UL key, malformed BER length) or Tag 2 is missing, the auto-refresh is a soft no-op and the original bytes are still sent.
The auto-refresh mutates the user’s buffer in place: this is intentional so the application can preserve a single buffer of constant tags (lat/lon/heading from sensors) and let the library update Tag 2/Tag 1 each frame. The Universal Label expected for UAS LDS is 06 0E 2B 34 02 0B 01 01 0E 01 03 01 01 00 00 00; anything else short-circuits the timestamp update and the original bytes are sent unchanged.
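An application can run the same kind of structural check on its own buffer before handing it to send(). The sketch below is a hedged illustration: parseBerLength and looksLikeUasLds are hypothetical helpers, and they only verify the 16-byte key and the BER length, not the individual tags or the checksum.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Universal Label of the UAS Datalink Local Set, as quoted above.
static const uint8_t kUasLdsKey[16] = {0x06, 0x0E, 0x2B, 0x34, 0x02, 0x0B,
                                       0x01, 0x01, 0x0E, 0x01, 0x03, 0x01,
                                       0x01, 0x00, 0x00, 0x00};

// Parse a BER length at buf[0]; returns bytes consumed, 0 on error.
inline std::size_t parseBerLength(const uint8_t* buf, std::size_t avail,
                                  std::size_t& value) {
    if (avail == 0) return 0;
    if ((buf[0] & 0x80) == 0) {  // short form: length in 7 bits
        value = buf[0];
        return 1;
    }
    const std::size_t n = buf[0] & 0x7F;  // long form: n length bytes
    if (n == 0 || n > sizeof(std::size_t) || avail < 1 + n) return 0;
    value = 0;
    for (std::size_t i = 0; i < n; ++i) value = (value << 8) | buf[1 + i];
    return 1 + n;
}

// True when the buffer starts with the UAS LDS key and the declared
// payload length fits inside the buffer.
inline bool looksLikeUasLds(const uint8_t* buf, std::size_t size) {
    if (size < 17 || std::memcmp(buf, kUasLdsKey, 16) != 0) return false;
    std::size_t payload = 0;
    const std::size_t used = parseBerLength(buf + 16, size - 16, payload);
    return used != 0 && payload <= size - 16 - used;
}
```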
MPEG-TS over UDP (MISB ST 1402)
streamType="mpegts". Every UDP datagram is exactly 7 × 188 = 1316 bytes of TS data, no RTP. Receivers see the multiplex as a continuous TS stream — the same way they would interpret the output of any standard MPEG-TS muxer (ffmpeg -f mpegts, VLC, GStreamer’s tsdemux).
MPEG-TS over RTP (MISB ST 1403)
streamType="mpegts-rtp". The same 1316-byte TS payload is preceded by a 12-byte RTP header (RFC 2250 / RFC 3551 payload type 33 — MP2T/90000):
| RTP field | Value |
|---|---|
| Version (V) | 2 |
| Padding (P) | 0 |
| Extension (X) | 0 |
| CSRC count (CC) | 0 |
| Marker (M) | 1 on RTP packets that contain a TS packet carrying a PCR (RFC 2250 §2.1); 0 otherwise. |
| Payload type | 33 (MP2T) |
| Sequence number | per-stream uint16 incrementing, starting at 0 on lazy init. |
| Timestamp | 90 kHz, equal to the inner video PES PTS. |
| SSRC | per-stream uint32 (default 12345). |
UDP datagram size is 12 + 1316 = 1328 bytes. A receiver pointed at this stream via an RTP/AVP 33 SDP file (a=rtpmap:33 MP2T/90000) will demux the inner TS exactly as in MISB ST 1402.
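The fixed fields from the table map onto the 12 header bytes as in this sketch. makeMp2tRtpHeader is a hypothetical helper name; the caller is assumed to supply the sequence number, timestamp and SSRC from its per-stream state.

```cpp
#include <array>
#include <cstdint>

// Build the 12-byte RTP header that precedes the 1316-byte TS payload.
inline std::array<uint8_t, 12> makeMp2tRtpHeader(bool marker, uint16_t seq,
                                                 uint32_t timestamp,
                                                 uint32_t ssrc) {
    std::array<uint8_t, 12> h{};
    h[0]  = 0x80;  // V=2, P=0, X=0, CC=0
    h[1]  = static_cast<uint8_t>((marker ? 0x80 : 0x00) | 33);  // M | PT=33
    h[2]  = static_cast<uint8_t>(seq >> 8);  // all fields are big-endian
    h[3]  = static_cast<uint8_t>(seq & 0xFF);
    h[4]  = static_cast<uint8_t>(timestamp >> 24);
    h[5]  = static_cast<uint8_t>((timestamp >> 16) & 0xFF);
    h[6]  = static_cast<uint8_t>((timestamp >> 8) & 0xFF);
    h[7]  = static_cast<uint8_t>(timestamp & 0xFF);
    h[8]  = static_cast<uint8_t>(ssrc >> 24);
    h[9]  = static_cast<uint8_t>((ssrc >> 16) & 0xFF);
    h[10] = static_cast<uint8_t>((ssrc >> 8) & 0xFF);
    h[11] = static_cast<uint8_t>(ssrc & 0xFF);
    return h;
}
```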
RtpPusher class description
RtpPusher class declaration
The RtpPusher.h file contains the RtpPusher class declaration:
namespace cr
{
namespace rtp
{
class RtpPusher
{
public:

    /// Return codes for send().
    static constexpr int SEND_OK            =  0; ///< Frame queued.
    static constexpr int SEND_INVALID_INPUT = -1; ///< Bad arguments.
    static constexpr int SEND_FRAME_DROP    = -2; ///< Older frames evicted.
    static constexpr int SEND_MODE_MISMATCH = -3; ///< streamType differs from latch.

    /// Get the library version string.
    static std::string getVersion();

    /// Destructor: calls stop() automatically.
    ~RtpPusher();

    /// Send a coded video frame (with optional user-data) to a destination.
    /// First call lazily allocates resources; subsequent calls reuse them.
    /// streamType selects "rtp" (default) / "mpegts" / "mpegts-rtp".
    /// Returns one of the SEND_* constants above.
    int send(uint8_t* data,
             size_t size,
             const std::string& fourcc,
             std::string ip,
             uint16_t port,
             uint16_t userDataPort,
             float fps,
             size_t maxPayloadSize = 1420,
             int targetBitrateKbps = 5000,
             uint8_t* userData = nullptr,
             size_t userDataSize = 0,
             const std::string& streamType = "rtp");

    /// Tear down all resources (thread, socket, ring buffer, per-destination
    /// state). The next send() call re-initialises from scratch.
    void stop();
};
}
}
There is no explicit constructor — a default-constructed RtpPusher holds no system resources. Resource lifetime is managed by send() (lazy create) and stop() (tear down).
getVersion method
The getVersion() method returns a string of the current version of the RtpPusher class. Method declaration:
static std::string getVersion();
The method can be used without an RtpPusher class instance. Example:
std::cout << "RtpPusher version: " << cr::rtp::RtpPusher::getVersion() << std::endl;
Console output:
RtpPusher version: 2.0.0
send method
The send(…) method is the only entry point for streaming. It is non-blocking: it parses the input bitstream, builds RTP packets and enqueues them in the internal ring buffer, then returns. The pacing thread does the actual sendto() calls.
Lazy initialisation. If this is the first send() since construction (or since the last stop()), the call opens the UDP socket, allocates the ring buffer and starts the pacing thread before processing the frame. Per-destination state (sequence numbers, frame counter for the 90 kHz clock, HEVC parameter-set cache, pacing config) is created on the first call to a new (ip, port) and reused on subsequent calls. The same instance can therefore stream to several receivers concurrently.
Stream start gating (H.264 / H.265 only). Frames sent before the first IDR (H.264) / IRAP (H.265) on a destination are silently dropped: send() still returns SEND_OK, but no UDP traffic is produced and the per-stream RTP timestamp counter does not advance. Once the first intra-coded frame has been observed, all subsequent frames flow through. JPEG bypasses the gate (every JPEG frame is an I-frame). See Stream start gating for details.
Method declaration:
int send(uint8_t* data,
         size_t size,
         const std::string& fourcc,
         std::string ip,
         uint16_t port,
         uint16_t userDataPort,
         float fps,
         size_t maxPayloadSize = 1420,
         int targetBitrateKbps = 5000,
         uint8_t* userData = nullptr,
         size_t userDataSize = 0,
         const std::string& streamType = "rtp");
| Parameter | Description |
|---|---|
| data | Pointer to the encoded video frame bytes. Must be in standard wire format: Annex B for H.264/H.265 (start codes 0x000001 / 0x00000001), full JFIF for JPEG. B-frames are not supported. Returns SEND_INVALID_INPUT if nullptr. |
| size | Size of data in bytes. Returns SEND_INVALID_INPUT if 0. |
| fourcc | Codec identifier: exactly one of the strings "H264", "H265", "JPEG" (case-sensitive). Any other value causes send() to return SEND_INVALID_INPUT. The library uses "H265" for HEVC. |
| ip | Destination IPv4 address in dotted-decimal form ("192.168.1.42"). Returns SEND_INVALID_INPUT if the string does not parse; invalid input never silently sends to the local default route. |
| port | Destination UDP port. Used for video in RTP modes; the only UDP port for the whole TS multiplex (video + KLV) in MPEG-TS modes. |
| userDataPort | RTP modes: selects user-data delivery (see User-data transmission (RTP modes)): 0 = embedded SEI, == port = muxed on the same port, anything else = separate UDP port. Ignored in MPEG-TS modes (KLV always travels on PID 0x0101 inside the same UDP port). |
| fps | Frames per second. Must be > 0; non-positive values are replaced by 30.0. Used as the divisor of the 90 kHz timestamp clock. Non-integer values (23.976, 29.97, …) are handled without cumulative drift, see RTP timestamp clock. |
| maxPayloadSize | Maximum size of one RTP packet (RTP header + payload) in bytes. Default 1420, which leaves headroom below the 1472 bytes available after the IPv4 + UDP headers on a 1500 B Ethernet MTU. Out-of-range values (< 256 or > 1600) are silently replaced by 1420. Used in RTP mode only; MPEG-TS modes always emit fixed-size 1316-byte (mpegts) or 1328-byte (mpegts-rtp) UDP datagrams per ST 1402/1403. |
| targetBitrateKbps | Pacer target rate in kilobits per second. Default 5000. Selects the pacing mode: > 0 = token-bucket pacing toward this rate (Mode A); <= 0 = kernel UDP send-buffer back-pressure (Mode B). The value is captured per packet at enqueue time, so changing it between frames immediately retargets the pacer. |
| userData | Optional pointer to user-data bytes (KLV in MPEG-TS modes; arbitrary in RTP modes). Pass nullptr to send video only. In MPEG-TS modes the buffer is mutated in place: Tag 2 (Precision Time Stamp) is refreshed to current UTC microseconds and Tag 1 (Checksum) is recomputed before transmission. |
| userDataSize | Size of the user-data buffer in bytes. Pass 0 to send video only. In RTP modes the user-data packet is silently dropped if userDataSize + 12 > maxPayloadSize. |
| streamType | Container / transport mode (default "rtp"): "rtp" (or empty), "mpegts", "mpegts-rtp"; see STANAG 4609 streaming. The mode is latched per destination on the first call; subsequent calls with a different value return SEND_MODE_MISMATCH. |
Validity matrix for (fourcc × streamType):
| Codec | rtp | mpegts | mpegts-rtp |
|---|---|---|---|
| H264 | yes | yes | yes |
| H265 | yes | yes | yes |
| JPEG | yes | no | no |
JPEG-in-MPEG-TS is not defined by STANAG 4609 / MISB ST 1402 and is rejected at the API level.
Return value (int):
| Constant | Value | Meaning |
|---|---|---|
| SEND_OK | 0 | Frame accepted and queued. Also returned for the pre-IDR/IRAP silent-drop window: the gate transparently waits for the first intra on H.264 / H.265 streams. |
| SEND_INVALID_INPUT | -1 | Returned for: data == nullptr or size == 0; fourcc not one of "H264"/"H265"/"JPEG"; streamType not one of ""/"rtp"/"mpegts"/"mpegts-rtp"; ip not parseable as IPv4 dotted-decimal; or (fourcc, streamType) disallowed by the validity matrix (e.g. JPEG + mpegts). |
| SEND_FRAME_DROP | -2 | The current frame was queued, but one or more older frames had to be evicted from the ring buffer to make room. The receiver will see a gap. Indicates the producer is outrunning the network (encoder above pacing rate, or back-pressure mode with a saturated kernel buffer). |
| SEND_MODE_MISMATCH | -3 | streamType differs from the value latched on the first send() to this destination. Switching modes mid-stream would invalidate TS continuity counters, PSI tables, RTP sequence numbers and SSRCs all at once. Call stop() and start over to use a different mode. |
The four constants are declared as static constexpr int members of RtpPusher, so callers should compare against RtpPusher::SEND_OK etc. — the underlying integer values are documented for the table above but should not be hard-coded.
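Note that SEND_FRAME_DROP is negative even though the current frame was queued, so a plain rc < 0 test lumps it together with real rejections. The sketch below shows one way to separate the two cases. SendCodes is a stand-in struct that copies the values from the table so the snippet compiles on its own; real code would use the members of RtpPusher instead. The helper names are hypothetical.

```cpp
#include <string>

// Stand-in for the constants declared on RtpPusher (values from the table).
// In real code, use cr::rtp::RtpPusher::SEND_OK and friends instead.
struct SendCodes
{
    static constexpr int SEND_OK            = 0;
    static constexpr int SEND_INVALID_INPUT = -1;
    static constexpr int SEND_FRAME_DROP    = -2;
    static constexpr int SEND_MODE_MISMATCH = -3;
};

// True when the frame passed to send() was accepted, even if older
// frames had to be evicted to make room for it.
inline bool frameWasQueued(int rc)
{
    return rc == SendCodes::SEND_OK || rc == SendCodes::SEND_FRAME_DROP;
}

// Short human-readable label for logging.
inline std::string describeSendResult(int rc)
{
    switch (rc)
    {
    case SendCodes::SEND_OK:            return "ok";
    case SendCodes::SEND_INVALID_INPUT: return "invalid input";
    case SendCodes::SEND_FRAME_DROP:    return "queued, older frames dropped";
    case SendCodes::SEND_MODE_MISMATCH: return "streamType mismatch";
    default:                            return "unknown";
    }
}
```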
Parameter relationships and tuning hints:
- fps and the actual rate at which you call send() must agree. The library does not rate-limit send() itself; it only stamps timestamps on the basis of fps. If you call send() at 60 fps but pass fps = 30, the stream’s 90 kHz clock advances at twice the wall-clock rate and the receiver will play it back at 0.5× speed (or buffer up).
- targetBitrateKbps should reflect your channel capacity, not the codec’s nominal output bitrate. Set it well above the encoder’s average rate so paced bursts (large I-frames) can fan out within a frame interval; otherwise the ring buffer fills up and frames are dropped.
- For low-bitrate channels (LTE, satellite) where you cannot characterise capacity, set targetBitrateKbps <= 0 to switch to back-pressure mode and let the kernel’s UDP buffer occupancy drive the cadence.
- maxPayloadSize should match your worst-case path MTU minus 28 (IPv4) or 48 (IPv6) bytes. The default 1420 covers Ethernet without VLAN tagging; drop it to 1200 for VPN/IPsec or to 950 for typical LTE bearer paths.
- Per-destination state is sticky. To re-key a stream (new SSRC, fresh sequence numbers) on the same destination, call stop() first.
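The arithmetic behind the fps and maxPayloadSize hints can be sketched as follows. The function names are hypothetical, and the rounding of the per-frame timestamp step is an assumption; the library may compute it differently.

```cpp
#include <cstdint>

constexpr std::uint32_t kRtpClockHz = 90000;

// Timestamp increment per frame for a declared frame rate
// (rounded to the nearest tick; the rounding is an assumption).
inline std::uint32_t ticksPerFrame(float fps)
{
    return static_cast<std::uint32_t>(kRtpClockHz / fps + 0.5f);
}

// Seconds of media time stamped per second of wall-clock time when send()
// is called callsPerSecond times but frames are stamped with declaredFps.
inline double mediaClockRatio(double callsPerSecond, double declaredFps)
{
    return callsPerSecond / declaredFps;
}

// Upper bound for maxPayloadSize on a given path MTU:
// 20-byte IPv4 or 40-byte IPv6 header plus the 8-byte UDP header.
constexpr int maxPayloadForMtu(int pathMtu, bool ipv6)
{
    return pathMtu - (ipv6 ? 48 : 28);
}
```

For example, ticksPerFrame(30.0f) is 3000 ticks, and mediaClockRatio(60.0, 30.0) is 2.0: two seconds of media time are stamped per wall-clock second, which a receiver honouring the timestamps plays at half speed.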
Suggested bitrate ranges:
| Codec / resolution | targetBitrateKbps |
|---|---|
| H.264/H.265 720p | 2 000 – 5 000 |
| H.264/H.265 1080p | 5 000 – 10 000 |
| MJPEG 720p | 30 000 – 50 000 |
| MJPEG 1080p | 50 000 – 100 000 |
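To judge whether a given targetBitrateKbps leaves enough headroom, it can help to estimate how long the pacer needs to drain one large frame. The sketch below is a back-of-the-envelope estimate only: it ignores RTP/UDP/IP header overhead and any burst allowance in the token bucket, the function names are hypothetical, and it assumes targetBitrateKbps > 0.

```cpp
#include <cstddef>

// Milliseconds needed to send frameBytes at targetBitrateKbps.
// bits / (kbit/s) yields milliseconds directly.
constexpr double drainTimeMs(std::size_t frameBytes, int targetBitrateKbps)
{
    return (static_cast<double>(frameBytes) * 8.0) / targetBitrateKbps;
}

// Duration of one frame interval in milliseconds.
constexpr double frameIntervalMs(double fps)
{
    return 1000.0 / fps;
}

// True when a frame of this size can be paced out before the next one is due.
constexpr bool fitsInFrameInterval(std::size_t frameBytes,
                                   int targetBitrateKbps, double fps)
{
    return drainTimeMs(frameBytes, targetBitrateKbps) <= frameIntervalMs(fps);
}
```

A 100 000-byte I-frame paced at 5000 kbps takes 160 ms, which is almost five frame intervals at 30 fps. The backlog only clears if the following P-frames are small enough, which is why the target rate should sit well above the encoder's average.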
stop method
The stop() method tears down every resource owned by the pusher: it joins the pacing thread, closes the UDP socket, frees the packet ring buffer and clears all per-destination state (sequence numbers, frame counters, HEVC parameter-set cache). Method declaration:
void stop();
After stop() returns, the object is back in its uninitialised default state — calling send() again lazily re-creates everything from scratch with fresh sequence numbers and SSRCs. This is the intended way to re-key a stream or to release sockets while keeping the object around.
stop() is idempotent and safe to call on an already-stopped (or never-used) instance. The destructor calls stop() automatically, so explicit shutdown is optional.
The cumulative drop counter that send() consults to flag SEND_FRAME_DROP is not reset — it is statistical state, not a resource, and survives across stop()/send() cycles.
stop() must not be called concurrently with send().
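If send() and stop() can be reached from different threads in your application, one simple way to honour this rule is to route both through a shared mutex. The wrapper below is a generic sketch, not part of the library: SerializedPusher is a hypothetical name, and the template parameter stands for cr::rtp::RtpPusher or any type with compatible send() and stop() members.

```cpp
#include <mutex>
#include <utility>

// Serialises send() and stop() on the wrapped pusher so that the two
// calls can never overlap.
template <typename Pusher>
class SerializedPusher
{
public:
    // Forwards all arguments unchanged to Pusher::send().
    template <typename... Args>
    int send(Args&&... args)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_pusher.send(std::forward<Args>(args)...);
    }

    void stop()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pusher.stop();
    }

private:
    std::mutex m_mutex;
    Pusher m_pusher;
};
```

The lock is held for the duration of each call, so a stop() issued from a control thread waits for an in-flight send() to return before tearing anything down.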
Build and connect to your project
Typical commands to build the RtpPusher library:
cd RtpPusher
mkdir build
cd build
cmake ..
make
If you want to add the RtpPusher library to your CMake project as source code, follow the steps below. For example, suppose your repository has this structure:
CMakeLists.txt
src
    CMakeLists.txt
    yourLib.h
    yourLib.cpp
Create a 3rdparty folder and copy the RtpPusher repository folder there. New structure of your repository:
CMakeLists.txt
src
    CMakeLists.txt
    yourLib.h
    yourLib.cpp
3rdparty
    RtpPusher
Create a CMakeLists.txt file in the 3rdparty folder. CMakeLists.txt should contain:
cmake_minimum_required(VERSION 3.13)
################################################################################
## 3RD-PARTY
## dependencies for the project
################################################################################
project(3rdparty LANGUAGES CXX)
################################################################################
## SETTINGS
## basic 3rd-party settings before use
################################################################################
# To inherit the top-level architecture when the project is used as a submodule.
SET(PARENT ${PARENT}_YOUR_PROJECT_3RDPARTY)
# Disable self-overwriting of parameters inside included subdirectories.
SET(${PARENT}_SUBMODULE_CACHE_OVERWRITE OFF CACHE BOOL "" FORCE)
################################################################################
## INCLUDING SUBDIRECTORIES
## Adding subdirectories according to the 3rd-party configuration
################################################################################
add_subdirectory(RtpPusher)
The 3rdparty/CMakeLists.txt file adds the RtpPusher folder to your project and excludes the example from compiling (by default the example is excluded from compiling if RtpPusher is included as a sub-repository). The new structure of your repository will be:
CMakeLists.txt
src
    CMakeLists.txt
    yourLib.h
    yourLib.cpp
3rdparty
    CMakeLists.txt
    RtpPusher
Next, you need to include the 3rdparty folder in the main CMakeLists.txt file of your repository. Add the following line at the end of your main CMakeLists.txt:
add_subdirectory(3rdparty)
Next, you have to include the RtpPusher library in your src/CMakeLists.txt file:
target_link_libraries(${PROJECT_NAME} RtpPusher)
Done!
Preparing test input files
The example application reads raw bitstream files (.h264, .h265, .mjpeg) — not containers like .mp4 — because the bundled VSourceFile library parses Annex B start codes / JPEG markers directly and feeds each frame to RtpPusher::send() without re-encoding. RtpPusher additionally requires the encoded video to contain no B-frames: the library writes RTP timestamps in display order, which would mismatch the coded order produced by an encoder that emits B-frames, and the receiver would see DTS/PTS warnings or out-of-order playback.
The recipes below convert any .mp4 you have lying around (e.g. static/test.mp4) into matched-fps H.264/H.265/MJPEG files at 1280×720, 1000 frames each, with B-frames disabled at the encoder. The same pattern works for any source resolution / frame count — adjust the filters and -frames:v accordingly.
Run all commands from the directory holding test.mp4 (or pass an absolute path).
H.264 (.h264)
for fps in 30 60 90; do
    ffmpeg -y -i test.mp4 \
        -c:v libx264 \
        -bf 0 \
        -profile:v high \
        -pix_fmt yuv420p \
        -vf "scale=1280:720,fps=${fps}" \
        -frames:v 1000 \
        -an \
        -f h264 \
        test_${fps}fps.h264
done
Why each flag matters:
| Flag | Purpose |
|---|---|
| -c:v libx264 | Use the x264 encoder. |
| -bf 0 | Disable B-frames. This is the critical option: by default x264 allows up to 3 consecutive B-frames between reference frames. |
| -profile:v high | High profile is fine without B-frames; pick main if you need wider decoder compatibility (e.g. some hardware decoders). |
| -pix_fmt yuv420p | 4:2:0 8-bit, the chroma layout every common H.264 decoder accepts. |
| -vf "scale=...,fps=N" | Resize and convert to the exact target frame rate. Pinning fps here makes the resulting bitstream’s per-frame timestamp interval match what you’ll later pass to send(..., fps=N, ...). |
| -frames:v 1000 | Cap output at 1000 frames so the test files stay small and bounded. |
| -an | Drop audio (no place for it in a raw H.264 elementary stream). |
| -f h264 | Force raw Annex B output explicitly instead of relying on ffmpeg to infer the output format from the .h264 extension. |
H.265 (.h265)
for fps in 30 60 90; do
    ffmpeg -y -i test.mp4 \
        -c:v libx265 \
        -x265-params "bframes=0:log-level=error" \
        -pix_fmt yuv420p \
        -vf "scale=1280:720,fps=${fps}" \
        -frames:v 1000 \
        -an \
        -f hevc \
        test_${fps}fps.h265
done
Notes:
- -x265-params bframes=0 is the HEVC equivalent of -bf 0. Mandatory for RtpPusher.
- log-level=error silences x265’s noisy info banner; drop it if you want to see encoding stats.
- -f hevc forces a raw HEVC Annex B elementary stream.
MJPEG (.mjpeg)
MJPEG is a sequence of independent JPEG frames concatenated into one file. There are no inter-frame dependencies, so “B-frames” are not a concept here, but the JPEG bitstream still needs careful options to be RTP-friendly:
for fps in 30 60 90; do
    ffmpeg -y -i test.mp4 \
        -c:v mjpeg \
        -huffman default \
        -force_duplicated_matrix 1 \
        -q:v 5 \
        -pix_fmt yuvj420p \
        -vf "scale=1280:720,fps=${fps}" \
        -frames:v 1000 \
        -an \
        -f mjpeg \
        test_${fps}fps.mjpeg
done
Why these MJPEG-specific flags matter for RFC 2435:
| Flag | Purpose |
|---|---|
| -huffman default | Use the standard JPEG Huffman tables. ffmpeg’s default is optimal, which writes a custom DHT segment per frame. RFC 2435 receivers reconstruct JPEGs with the standard DHT, so optimal causes “error y=… x=…” decode errors at the receiver. |
| -force_duplicated_matrix 1 | Emit separate luma and chroma quantisation tables (DQT) even when they are identical. Strict RFC 2435 receivers expect 128 bytes of quantisation data (luma + chroma) for Type 0/1; a single shared 64-byte table fails on them. RtpPusher also internally duplicates a single shared table as a safety net, but generating both up-front is cleaner. |
| -q:v 5 | Quality level (1 = best, 31 = worst). Adjust for your bandwidth/test needs; 5 is visually lossless at 720p. |
| -pix_fmt yuvj420p | JPEG full-range 4:2:0, the standard for MJPEG. The j suffix marks full range, which most JPEG decoders expect. |
| -f mjpeg | Force raw concatenated-JPEG output (one Start Of Image / End Of Image pair per frame). |
For 1080p MJPEG, swap 1280:720 for 1920:1080 and bump the streaming targetBitrateKbps accordingly (50 000 – 100 000 kbps).
Verifying that no B-frames are present
After generating the files, confirm there are no B-frames with ffprobe:
ffprobe -v error -select_streams v:0 -show_frames -show_entries frame=pict_type \
-of csv=p=0 test_30fps.h264 | sort -u
Expected output:
I
P
If you also see B, the encoder ignored the disable flag (e.g. you passed -bf 0 for libx265 instead of -x265-params bframes=0, or vice versa). Re-encode before feeding the file to RtpPusher.
For MJPEG, every frame is I by construction — ffprobe will simply print I.
To double-check the H.265 bitstream, you can also inspect pict_type distribution:
ffprobe -v error -select_streams v:0 -show_frames -show_entries frame=pict_type \
-of csv=p=0 test_30fps.h265 | sort | uniq -c
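A typical 1000-frame run gives ~30 I (one per ~33-frame GOP) and ~970 P, with zero B. The exact I/P split depends on the encoder’s keyframe interval and on scene cuts in the source; the part that matters for RtpPusher is that the B count is zero.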
Example
A minimal application that streams an H.264 file to 127.0.0.1:7032. The example uses the bundled VSourceFile library to read frames from a file.
#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <thread>
#include "VSourceFile.h"
#include "RtpPusher.h"

int main()
{
    // Open the input file (default 1280x720, 30 fps).
    cr::video::VSourceFile source;
    if (!source.openVSource("test.h264;1280;720;30"))
        return -1;

    // No init() / open(): the pusher allocates resources lazily on first send().
    cr::rtp::RtpPusher rtpPusher;
    cr::video::Frame frame;

    while (true)
    {
        // Wait up to 1000 ms for the next frame from the file.
        if (!source.getFrame(frame, 1000))
            continue;

        // Map the source's fourcc enum to the string accepted by send().
        // The library uses "H265" (not "HEVC") for H.265.
        std::string codec;
        switch (frame.fourcc)
        {
        case cr::video::Fourcc::H264: codec = "H264"; break;
        case cr::video::Fourcc::HEVC: codec = "H265"; break;
        case cr::video::Fourcc::JPEG: codec = "JPEG"; break;
        default: continue;
        }

        const float fps = 30.0f;
        const uint16_t port = 7032;
        const uint16_t userDataPort = 0;    // embed metadata as SEI
        const int maxPayloadSize = 1420;
        const int targetBitrateKbps = 5000; // ~5 Mbps token-bucket pacing

        const int rc = rtpPusher.send(frame.data,
                                      static_cast<size_t>(frame.size), codec,
                                      "127.0.0.1", port, userDataPort,
                                      fps, maxPayloadSize, targetBitrateKbps);
        switch (rc)
        {
        case cr::rtp::RtpPusher::SEND_OK:
            std::cout << "Frame: " << frame.frameId << "\n";
            break;
        case cr::rtp::RtpPusher::SEND_FRAME_DROP:
            std::cerr << "[drop] frame " << frame.frameId
                      << " queued, but older frames evicted "
                         "(producer outrunning the network)\n";
            break;
        case cr::rtp::RtpPusher::SEND_INVALID_INPUT:
            std::cerr << "send() rejected the frame: invalid input\n";
            continue;
        case cr::rtp::RtpPusher::SEND_MODE_MISMATCH:
            std::cerr << "send() rejected the frame: streamType "
                         "differs from the latched value\n";
            continue;
        }

        std::this_thread::sleep_for(std::chrono::milliseconds(33)); // ~30 fps
    }

    // rtpPusher.stop() is called automatically by the destructor.
    return 0;
}
Bundled interactive example
The interactive example shipped under example/main.cpp is intentionally minimal — it asks just three questions and infers everything else from the test files in static/:
=== RtpPusher example ===
Before running, copy the test_*.{h264,h265,mjpeg} and *.sdp
files from RtpPusher/static into the same directory as this
executable.
Available test files:
1. test_30fps_1920x1080_5000kbps.h264 (H264, 30 fps, 5000 kbps)
2. test_60fps_1920x1080_5000kbps.h264 (H264, 60 fps, 5000 kbps)
3. test_90fps_1920x1080_5000kbps.h264 (H264, 90 fps, 5000 kbps)
4. test_30fps_1920x1080_5000kbps.h265 (H265, 30 fps, 5000 kbps)
5. test_60fps_1920x1080_5000kbps.h265 (H265, 60 fps, 5000 kbps)
6. test_90fps_1920x1080_5000kbps.h265 (H265, 90 fps, 5000 kbps)
7. test_30fps_1920x1080_20000kbps.mjpeg (JPEG, 30 fps, 20000 kbps)
8. test_60fps_1920x1080_20000kbps.mjpeg (JPEG, 60 fps, 20000 kbps)
9. test_90fps_1920x1080_20000kbps.mjpeg (JPEG, 90 fps, 20000 kbps)
Select file [1-9]:
Destination IP [127.0.0.1]:
Stream type (rtp/mpegts/mpegts-rtp) [rtp]:
Workflow:
- Copy all test_*.{h264,h265,mjpeg} and *.sdp files from static/ into the directory of the built RtpPusherExample binary, then run the binary from that directory. The app opens fixtures by relative filename (the bundled VSourceFile library reads them as raw Annex B / JFIF, not as .mp4 containers).
- Pick an index 1-9. The app reads fps and the source bitrate from a hard-coded catalogue (see g_fixtures at the top of example/main.cpp); there is no filename parsing.
- Initialise VSourceFile with <filename>;1920;1080;<fps>. Failure aborts with a clear error.
- Enter destination IP and streamType. Up-front guard: JPEG + mpegts* is rejected immediately with a helpful message (the library would also return SEND_INVALID_INPUT, but bailing here gives a friendlier signal).
- Stream to udp://<ip>:7032. The app fixes:
  - port = 7032, userDataPort = 7032, maxPayloadSize = 1420.
  - targetBitrateKbps = 2 × source_bitrate (10 000 kbps for H.264/H.265, 40 000 kbps for MJPEG). Twice the source rate gives the pacer enough headroom to absorb I-frame bursts without forcing the proactive-drop path.
  - KLV is auto-attached only in MPEG-TS modes, where it travels on its own PID (0x101) per STANAG 4609 / MISB ST 1402. In rtp mode the example sends video only, because muxing KLV on the same UDP port with a distinct PT/SSRC confuses off-the-shelf RTP demuxers (e.g. ffmpeg with the bundled video-only .sdp files), which would interpret the metadata packets as out-of-order video and emit “RTP: dropping old packet received too late” warnings.
- The app prints one line per frame:
  - Frame: <id> on SEND_OK.
  - [drop] frame <id> queued, but older frames evicted (producer outrunning the network). on SEND_FRAME_DROP.
  - send: invalid input ... / send: streamType differs from the latched value ... on the other negative codes.
Streaming the same source as STANAG 4609 (MPEG-TS + KLV) — the only changes are passing streamType="mpegts" and a KLV buffer:
// Example KLV (MISB ST 0601 LDS): UAS UL + Tag 65 (LDS Version) + Tag 2
// (PrecisionTimeStamp, refreshed by the library each frame) + Tag 1
// (Checksum, recomputed by the library each frame). Real applications
// add Tag 5/6/7 platform attitude, Tag 13/14/15 sensor lat/lon/alt,
// Tag 16/17 FOV, Tag 23/24/25 frame-centre geolocation, etc.
uint8_t klv[34] = {
    // 16-byte UAS Datalink LDS Universal Label
    0x06, 0x0E, 0x2B, 0x34, 0x02, 0x0B, 0x01, 0x01,
    0x0E, 0x01, 0x03, 0x01, 0x01, 0x00, 0x00, 0x00,
    // outer BER length = 17 bytes
    0x11,
    // Tag 65 (UAS LDS Version) = 1
    0x41, 0x01, 0x01,
    // Tag 2 (PrecisionTimeStamp): placeholder, library refreshes
    0x02, 0x08, 0,0,0,0,0,0,0,0,
    // Tag 1 (Checksum): placeholder, library recomputes
    0x01, 0x02, 0,0
};

const int rc = rtpPusher.send(frame.data,
                              static_cast<size_t>(frame.size), codec,
                              "127.0.0.1", /*port=*/7032, /*userDataPort=*/0,
                              /*fps=*/30.0f,
                              /*maxPayloadSize=*/1420,
                              /*targetBitrateKbps=*/10000,
                              /*userData=*/klv, /*userDataSize=*/sizeof(klv),
                              /*streamType=*/"mpegts");
if (rc < 0)
    std::cerr << "send() returned " << rc << "\n"; // SEND_INVALID_INPUT,
                                                   // SEND_FRAME_DROP, or
                                                   // SEND_MODE_MISMATCH
Replace "mpegts" with "mpegts-rtp" to switch to the MISB ST 1403 transport.
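For readers who want to see what refreshing Tag 2 and Tag 1 involves, the sketch below applies the MISB ST 0601 16-bit running checksum to a buffer laid out like the example above. It is not the library's actual code. klvChecksum and refreshKlv are hypothetical names, and the walker assumes a 16-byte UL, a short-form (single-byte) outer BER length and single-byte tags and lengths.

```cpp
#include <cstddef>
#include <cstdint>

// MISB ST 0601 checksum: 16-bit running sum in which bytes at even offsets
// are added to the high byte and bytes at odd offsets to the low byte.
inline std::uint16_t klvChecksum(const std::uint8_t* buf, std::size_t len)
{
    std::uint16_t sum = 0;
    for (std::size_t i = 0; i < len; ++i)
        sum = static_cast<std::uint16_t>(sum + (buf[i] << (8 * ((i + 1) % 2))));
    return sum;
}

// Rewrites Tag 2 (8-byte big-endian UTC microseconds) and Tag 1 (checksum)
// in place. Returns false if the buffer does not match the assumed layout.
inline bool refreshKlv(std::uint8_t* klv, std::size_t size, std::uint64_t utcMicros)
{
    const std::size_t headerSize = 17; // 16-byte UL + 1-byte BER length
    if (size < headerSize || (klv[16] & 0x80) || headerSize + klv[16] != size)
        return false;

    bool checksumWritten = false;
    std::size_t pos = headerSize;
    while (pos + 2 <= size)
    {
        const std::uint8_t tag = klv[pos];
        const std::uint8_t len = klv[pos + 1];
        const std::size_t value = pos + 2;
        if ((len & 0x80) || value + len > size)
            return false;

        if (tag == 2 && len == 8)
        {
            for (int i = 0; i < 8; ++i)
                klv[value + i] = static_cast<std::uint8_t>(utcMicros >> (8 * (7 - i)));
        }
        else if (tag == 1 && len == 2)
        {
            // The checksum covers every byte from the start of the UL up to
            // and including the Tag 1 length byte.
            const std::uint16_t sum = klvChecksum(klv, value);
            klv[value]     = static_cast<std::uint8_t>(sum >> 8);
            klv[value + 1] = static_cast<std::uint8_t>(sum & 0xFF);
            checksumWritten = true;
        }
        pos = value + len;
    }
    return checksumWritten && pos == size;
}
```

Because Tag 1 is the last item in the set, the checksum is computed after Tag 2 has been rewritten and therefore covers the fresh timestamp. A current UTC value in microseconds can be obtained from std::chrono::system_clock.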
How to play the stream
The receiving setup depends on the streamType selected at the sender.
RTP mode (default)
streamType="rtp". To play with ffplay or VLC, you need an SDP file describing the stream (ready-made examples for H.264, H.265 and JPEG are in the /static folder). Important: replace the IP address and port in the SDP file with the values used in your application.
For H.264:
s=RTP H264 Stream
c=IN IP4 127.0.0.1
m=video 7032 RTP/AVP 96
a=rtpmap:96 H264/90000
For H.265:
s=RTP H265 Stream
c=IN IP4 127.0.0.1
m=video 7032 RTP/AVP 96
a=rtpmap:96 H265/90000
For JPEG:
s=RTP JPEG Stream
c=IN IP4 127.0.0.1
m=video 7032 RTP/AVP 26
a=rtpmap:26 JPEG/90000
ffplay -protocol_whitelist file,udp,rtp -i h264.sdp
ffmpeg -protocol_whitelist file,udp,rtp -i h264.sdp -f null -stats -
In VLC, open the SDP file (e.g. h264.sdp) directly.
MPEG-TS over UDP (mpegts)
streamType="mpegts". The wire format is plain MPEG-TS; no SDP file is required. Receivers consume the UDP stream directly:
# ffmpeg / ffprobe — point at the UDP socket. The `?fifo_size=...` is
# optional but recommended on lossy networks.
ffmpeg -i udp://@:7032 -f null -stats -
ffprobe -i udp://@:7032 -show_streams
# ffplay — same syntax, with display.
ffplay -i udp://@:7032
# VLC — open URL: udp://@:7032
The PMT will advertise the video stream (PID 0x100, codec H.264 or H.265) and the KLV stream (PID 0x101, codec_tag KLVA).
For deeper compliance analysis use TSDuck (free, open source — tsduck.io):
tsp -I ip 127.0.0.1:7032 -P analyze -O drop # full structure summary
tsp -I ip 127.0.0.1:7032 -P pcrextract --csv -O drop # PCR jitter
tsp -I ip 127.0.0.1:7032 -P continuity -O drop # CC integrity
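When writing a custom receiver instead of using the tools above, a first sanity check is that each datagram consists of whole 188-byte TS packets starting with the 0x47 sync byte. The sketch below extracts the 13-bit PID of each packet. tsPidsInDatagram is a hypothetical helper; the headerSize parameter is an assumption that lets the same code skip a fixed 12-byte RTP header (no CSRC list, no extension) in mpegts-rtp mode.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the PID of every TS packet in the datagram, or an empty vector
// if the payload is not a whole number of sync-aligned 188-byte packets.
inline std::vector<std::uint16_t> tsPidsInDatagram(const std::uint8_t* data,
                                                   std::size_t size,
                                                   std::size_t headerSize = 0)
{
    const std::size_t tsPacketSize = 188;
    std::vector<std::uint16_t> pids;
    if (size <= headerSize || (size - headerSize) % tsPacketSize != 0)
        return pids;

    for (std::size_t off = headerSize; off < size; off += tsPacketSize)
    {
        if (data[off] != 0x47) // TS sync byte
            return {};
        // PID: low 5 bits of byte 1 followed by all 8 bits of byte 2.
        pids.push_back(static_cast<std::uint16_t>(
            ((data[off + 1] & 0x1F) << 8) | data[off + 2]));
    }
    return pids;
}
```

With the PIDs quoted above, packets on 0x100 carry video and packets on 0x101 carry KLV; PID 0 is the PAT.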
MPEG-TS over RTP (mpegts-rtp)
streamType="mpegts-rtp". Same TS multiplex as above but wrapped in RTP/AVP 33. Receivers need an SDP file:
s=STANAG 4609 MPEG-TS over RTP
c=IN IP4 127.0.0.1
m=video 7032 RTP/AVP 33
t=0 0
a=rtpmap:33 MP2T/90000
ffplay -protocol_whitelist file,udp,rtp -i mp2t.sdp
ffmpeg -protocol_whitelist file,udp,rtp -i mp2t.sdp -f null -stats -
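VLC also accepts this SDP. TSDuck can read the TS-over-RTP stream as well: its ip input plugin detects and skips the RTP header, so the same tsp -I ip 127.0.0.1:7032 commands shown for mpegts mode apply here.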
How to extract user data from the stream
The extraction method depends on the streamType (and, for RTP modes, the userDataPort value) passed to send().
Extracting from Separate Port Mode (RTP)
If user data is sent over its own RTP stream (userDataPort != 0 and userDataPort != port), receive it with gstreamer:
gst-launch-1.0 udpsrc port=7034 caps="application/x-rtp, media=application, encoding-name=SMPTE336M, clock-rate=90000, payload=98" \
! rtpjitterbuffer ! rtpklvdepay ! fakesink dump=true
Where 7034 is the metadata port (the userDataPort argument), and 98 is the payload type used for user-data.
Extracting from Muxed Mode (RTP)
If user-data is muxed on the same port as video (userDataPort == port), the metadata packets carry payload type 98 and SSRC 12346, while video packets carry payload type 96 (or 26 for JPEG) and SSRC 12345. Demux on the receiver side by SSRC or payload type.
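A receiver that reads the shared port directly can classify each datagram from the fixed 12-byte RTP header. The following sketch shows the idea; RtpInfo, parseRtpHeader and isUserDataPacket are hypothetical names, and the payload type value 98 is the one quoted above.

```cpp
#include <cstddef>
#include <cstdint>

struct RtpInfo
{
    bool          valid;
    std::uint8_t  payloadType;
    std::uint16_t sequence;
    std::uint32_t ssrc;
};

// Parses the fixed part of an RTP header (RFC 3550, version 2).
inline RtpInfo parseRtpHeader(const std::uint8_t* p, std::size_t size)
{
    RtpInfo info{false, 0, 0, 0};
    if (size < 12 || (p[0] >> 6) != 2)
        return info;

    info.valid       = true;
    info.payloadType = static_cast<std::uint8_t>(p[1] & 0x7F); // drop marker bit
    info.sequence    = static_cast<std::uint16_t>((p[2] << 8) | p[3]);
    info.ssrc        = (static_cast<std::uint32_t>(p[8]) << 24) |
                       (static_cast<std::uint32_t>(p[9]) << 16) |
                       (static_cast<std::uint32_t>(p[10]) << 8) |
                        static_cast<std::uint32_t>(p[11]);
    return info;
}

// User-data packets in muxed mode carry payload type 98.
inline bool isUserDataPacket(const RtpInfo& info)
{
    return info.valid && info.payloadType == 98;
}
```

Filtering on the SSRC (12346 for user data) works the same way and is equivalent as long as only one sender uses the port.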
Extracting from Embedded SEI Mode (RTP)
If user-data is embedded in SEI NALs (userDataPort == 0), it is wrapped in a user_data_unregistered SEI message (payload type 5) with the following 16-byte UUID prefix:
A9 B1 9F 13 44 46 EC 4D 8C BF 65 B1 E1 2D CF FD
Use FFmpeg, GStreamer’s SEI parser, or a custom NAL parser to extract it from the H.264 / H.265 video stream. JPEG has no SEI mechanism — embedded mode is silently a no-op for JPEG sources.
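For a custom parser, the simplest starting point is to search the received access unit for the UUID. The sketch below only locates the first byte after the UUID; it does not decode the SEI payload size or remove emulation-prevention bytes, both of which a complete extractor needs. findUserDataPayload is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// UUID that prefixes the user data inside the user_data_unregistered SEI.
static const std::uint8_t kUserDataUuid[16] = {
    0xA9, 0xB1, 0x9F, 0x13, 0x44, 0x46, 0xEC, 0x4D,
    0x8C, 0xBF, 0x65, 0xB1, 0xE1, 0x2D, 0xCF, 0xFD};

// Returns the offset of the first byte after the UUID,
// or `size` if the UUID does not occur in the buffer.
inline std::size_t findUserDataPayload(const std::uint8_t* data, std::size_t size)
{
    const std::uint8_t* end = data + size;
    const std::uint8_t* hit =
        std::search(data, end, kUserDataUuid, kUserDataUuid + 16);
    return hit == end ? size : static_cast<std::size_t>(hit - data) + 16;
}
```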
Extracting KLV from MPEG-TS (mpegts / mpegts-rtp)
In mpegts and mpegts-rtp modes the KLV travels on PID 0x101 with stream_type 0x15 and a registration_descriptor "KLVA" in the PMT. Receivers identify it as codec_tag KLVA (codec_name klv in ffprobe).
Extract the raw KLV bytes with ffmpeg:
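# mpegts mode (read directly from UDP socket)
# -f data selects the raw data muxer; the .bin extension alone does not
# tell ffmpeg which output format to use.
ffmpeg -i udp://@:7032 -map 0:d -c copy -f data klv.bin
# mpegts-rtp mode (read via SDP)
ffmpeg -protocol_whitelist file,udp,rtp -i mp2t.sdp \
    -map 0:d -c copy -f data klv.bin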
Validate the KLV content with the klvdata Python package (recommended for CI):
pip install klvdata
import klvdata

with open("klv.bin", "rb") as f:
    packets = list(klvdata.StreamParser(f))

mandatory = {1, 2, 65}  # Checksum, Precision Time Stamp, LDS Version
prev_ts = 0
for i, pkt in enumerate(packets):
    tags = {item.key[0] for item in pkt.items()}
    assert mandatory.issubset(tags), f"pkt {i}: missing {mandatory - tags}"
    for item in pkt.items():
        if item.key[0] == 2:  # Tag 2 = uint64 BE
            ts = int.from_bytes(item.value, "big")
            assert ts >= prev_ts, f"pkt {i}: Tag 2 went backwards"
            prev_ts = ts

print(f"OK: {len(packets)} KLV packets pass MISB ST 0601 validation")
For TS-level structural compliance use TSDuck:
tsp -I file capture.ts -P analyze -O drop # full PSI + bitrate
tsp -I file capture.ts -P tables --pid 0x1000 -O drop # PMT (KLVA descriptor visible)
tsp -I file capture.ts -P pcrextract --csv -O drop # PCR jitter
For final compliance certification (e.g. before delivering to a defence customer), the official validators are JMITT (Java MISB Image Test Tool) and MIST from NGA — both free, both behind a registration wall at nsgreg.nga.mil.
Output Example (Separate Port Mode, RTP)
00000000 (0x7f9f24014d44): 06 0e 2b 34 01 01 01 01 01 05 02 00 00 00 00 00 ..+4............
00000010 (0x7f9f24014d54): 10 59 65 73 74 65 72 64 61 79 73 20 57 6f 72 6c .Yesterdays Worl
00000020 (0x7f9f24014d64): 64 d