IP Cameras & HLS

IP cameras tend to produce a small number of RTMP/RTSP streams. It is difficult to get a web client to view these directly, and even if one could, the camera would probably support only about half a dozen clients at most.

The general solution seems to be to convert to HLS and serve the result from a generic web-server. Under Linux this can be done with very little CPU power.

HLS

HLS works by splitting the incoming video into short chunks of around 3-5 seconds. The client downloads these over HTTP in the usual way and plays them back to back. An index file is also maintained and constantly re-fetched by the client; this gives the names of the next segments to fetch. The client likes to keep several segments in reserve to cover network glitches, which results in a time lag of around 30 seconds, varying significantly between clients.
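
To make this concrete, the index (or media playlist) is just a short text file. The sketch below is illustrative: the tags are standard HLS, but the segment names and durations are invented.

#EXTM3U
# illustrative playlist: segment names and durations invented
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:142
#EXTINF:4.000,
media_142.ts
#EXTINF:4.000,
media_143.ts
#EXTINF:4.000,
media_144.ts

Each re-fetch sees new segments appended at the bottom and old ones dropped from the top, with EXT-X-MEDIA-SEQUENCE advancing to match.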

One significant advantage of HLS is that it can offer multiple streams with different bitrates / resolutions, and let the client choose depending on the resolution it is displaying at, and the connectivity it has.

Low computational cost HLS

Generally the advice is to use something like ffmpeg to decompress the webcam's stream, and then to recompress it at multiple different resolutions. However, one can often get by on low-cost hardware without needing to transcode.

Most webcams will produce both a high bandwidth "main" stream and a low bandwidth "sub" stream, or preview. If both of these are available in a codec that HLS supports (i.e. H264, not H265 or MJPEG), it may be possible to convert them into an HLS stream offering two bandwidths with no transcoding at all.
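
It is worth checking what the camera actually emits first; ffprobe will list the streams and their codecs. The URL below follows the RTSP form used for the Annke camera later on; adjust the address and credentials to suit.

# list the streams and codecs in the camera's sub stream
ffprobe -rtsp_transport tcp -hide_banner \
'rtsp://admin:password@192.168.1.80:554/H264/ch1/sub/av_stream'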

Reolink

With a Reolink camera, the rather lengthy line

ffmpeg -i 'rtmp://192.168.1.79/bcs/channel0_main.bcs?channel=0&stream=0&user=admin&password=' \
-i 'rtmp://192.168.1.79/bcs/channel0_sub.bcs?channel=0&stream=1&user=admin&password=' \
-c:v copy -b:v:0 8000k -b:v:1 400k -map 0:1 -map 0:2 -map 1:1 -map 1:2 \
-f hls -hls_list_size 10 \
-master_pl_name master.m3u8 -hls_flags delete_segments \
-hls_delete_threshold 3 -var_stream_map "v:0,a:0 v:1,a:1" \
/dev/shm/streaming/media_%v.m3u8

works.

Points to note include the use of /dev/shm for the output. This is a tmpfs filesystem, so the files get stored in a virtual disk drive actually made of physical RAM. Fast, low power, no disk wear. It will fill up though, so the delete_segments flag is used to remove old HLS segments, with hls_delete_threshold controlling how many expired segments are kept around.
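
One consequence is that tmpfs is emptied at every reboot, so anything which starts ffmpeg automatically must recreate the output directory first. The path here matches the command above.

# tmpfs is emptied at boot; recreate the output directory before starting ffmpeg
mkdir -p /dev/shm/streaming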

For this camera the main stream is 2560x1920, and the sub stream is 640x480. It seems that ffmpeg needs to be told their bandwidths so that they can be advertised in the HLS master playlist, here 8000kbit/s and 400kbit/s.
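
The generated master.m3u8 then ends up along the following lines. This is a sketch: the exact attributes vary between ffmpeg versions (ffmpeg also adds a CODECS entry when it can work one out).

#EXTM3U
# sketch of the generated master playlist - exact attributes vary by version
#EXT-X-STREAM-INF:BANDWIDTH=8000000,RESOLUTION=2560x1920
media_0.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=640x480
media_1.m3u8

The client fetches this once, then picks whichever variant suits its display and connection.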

HLS defaults to storing the video and audio in separate fragment files. Multiple video streams of different bandwidths can then reference the same audio stream. However, I am unconvinced this is advantageous. It requires the client to fetch two data files, one video and one audio, per segment, and it allows the client to get confused about which audio segment to play with which video segment. If you find yourself viewing an HLS stream and the audio is lagging about 5 seconds behind the video, this is probably the problem, and restarting the client may help.

The specified var_stream_map of v:0,a:0 v:1,a:1 produces just two outputs, the first containing both audio and video from the camera's high bandwidth stream, and the second the same for the low bandwidth stream.
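
For contrast, the shared-audio layout described above is what ffmpeg's hls muxer produces when the variants are tied together with an agroup tag. A hypothetical fragment, not used here, which would replace the -var_stream_map option above (mapping only one audio stream):

# NOT used here: one audio rendition shared by both video variants
-var_stream_map "a:0,agroup:aud v:0,agroup:aud v:1,agroup:aud"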

This camera presents three substreams in each RTMP stream. Ffmpeg regards these as being numbered from zero; the first such substream contains some unidentified data, and we actually want the second (video) and third (audio). The -map options select these.

Annke

A second camera, badged by Annke, produced an additional issue. Its streams are at the more conventional 16:9 ratio, with the main stream being 2560x1440. However, the substream does not offer any 16:9 frame size, but does offer 640x480. The image it produces should still be displayed at 16:9; it is simply using non-square pixels.

There is nothing wrong with non-square pixels, and most DVDs use them. However, there is something wrong with not advertising to the client that the pixels are not square, and this camera seems not to do so. Fortunately ffmpeg can add the required metadata without transcoding.

ffmpeg -rtsp_transport tcp \
-i 'rtsp://admin:password@192.168.1.80:554/H264/ch1/main/av_stream' \
-i 'rtsp://admin:password@192.168.1.80:554/H264/ch1/sub/av_stream' \
-c:v copy -bsf:v:1 "h264_metadata=sample_aspect_ratio=4/3" \
-b:v:0 7000k -b:v:1 512k -map 0:0 -map 0:1 -map 1:0 -map 1:1 \
-f hls -hls_list_size 10 \
-master_pl_name master.m3u8 -hls_flags delete_segments \
-hls_delete_threshold 3 -var_stream_map "v:0,a:0 v:1,a:1" \
/dev/shm/streaming/ptz/media_%v.m3u8

Note here that a different syntax is required for sending the username and password, and that an extra bitstream filter is applied to v:1, i.e. the second video channel, to set the pixel aspect ratio. The displayed aspect ratio is the pixel aspect ratio multiplied by the frame's width:height ratio, so 4/3 is the correction factor needed to take 640/480 = 4:3 to 16:9: (4/3) x (4/3) = 16/9.

Serving

Some HLS clients are reluctant to switch to the high-bandwidth versions. In this case, with the high bandwidth resolution being larger than many displays, it is not too surprising that the clients may think it better to try scaling up the low bandwidth source rather than scaling down the high bandwidth one.

It is therefore worth making available not only the adaptive source, master.m3u8, but also direct links to the fixed-bandwidth versions: media_0.m3u8 for the high bandwidth stream and media_1.m3u8 for the low bandwidth one.
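
The files themselves are plain static content, so any generic web-server can serve them. As a sketch, assuming nginx and the /dev/shm path used above (the location name and the choice to disable caching are assumptions, not a tested config):

# sketch: serve the HLS files written to /dev/shm with sensible MIME types
location /streaming/ {
    alias /dev/shm/streaming/;
    types {
        application/vnd.apple.mpegurl m3u8;
        video/mp2t ts;
    }
    add_header Cache-Control no-cache;
}

Disabling caching matters mostly for the .m3u8 files, which change every few seconds; the .ts segments are immutable and could safely be cached briefly.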

For this project, the popular video.js client was used.
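
A minimal embedding is sketched below; the video.js version and file paths are illustrative rather than taken from the project. video.js 7 and later bundle HLS support, so pointing the player at master.m3u8 is enough.

<!-- sketch: version number and paths are illustrative -->
<link href="https://vjs.zencdn.net/7.20.3/video-js.css" rel="stylesheet">
<script src="https://vjs.zencdn.net/7.20.3/video.min.js"></script>

<video id="cam" class="video-js" controls muted width="640" height="480" data-setup="{}">
  <source src="/streaming/master.m3u8" type="application/x-mpegURL">
</video>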