DVswitch/ component interaction

About this document

Written June 2009 for DVswitch Version 0.8.2

This is meant for people who are interested in using DVswitch. It is not a substitute for the docs of the individual components; rather, it describes how they work together. Read it start to finish to get an understanding of the system, then use the program docs and tutorials as reference material.

Introduction

DVswitch is a suite of software that works with DV streams. It was originally designed for the needs of DebConf, and has since been used at other conferences and smaller events such as user groups. Its main feature is a UI that allows real-time mixing: selecting the audio from one stream and the video from another, switching between streams, and using picture-in-picture to mix two video streams.

What is a DV stream? “Digital Video (DV) is a digital video format created by Sony, JVC, Panasonic and other video camera producers, … The DV specification (originally known as the Blue Book, current official name IEC 61834) defines both the codec and the tape format.” (Wikipedia.) There are some minor variants such as PAL and NTSC, 12-bit and 16-bit audio, widescreen and 4:3 standard.

Why DV and not one of the other gazillion formats? Mainly because so many camcorders output DV over FireWire, the latency from the camera is low, and streams can be switched without re-encoding. The main disadvantage is the huge bitrate: a DV stream has a fixed bandwidth of about 29 Mbit/s, or about 13 GB per hour. Large hard drives are now cheap enough, but transmitting that much data is fairly impractical over a home Internet connection.
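
For the curious, here is where those numbers come from (the PAL case; NTSC works out to nearly the same rate):

144,000 bytes/frame x 25 frames/s = 3,600,000 bytes/s = 28.8 Mbit/s
3,600,000 bytes/s x 3,600 s/hour = 12,960,000,000 bytes, about 13 GB per hour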

DVswitch is a suite of programs that work together. The main component is the app called dvswitch. It listens for TCP/IP connections on a port, so technically it is a server. Don't let the fact that it also has a GUI confuse you. The clients have no UI; they are clients because they connect to the server. DVswitch clients can be either a source or a sink. Clients and server can run on the same box, or the connection can be made over a LAN. (There are plans to rework this so that the sources and sinks will be servers and the mixer GUI will be a client, so if this documentation seems completely backwards, it is probably out of date.)

The simplest config is dvswitch with one source. However, that will just display the video; it won't save it or send it anywhere, which is generally useless. To do anything useful, you need at least two sources and a sink. Then you can use the GUI to switch between the streams, and the resulting stream goes somewhere (such as being saved to disk).
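
As a concrete sketch of that minimal useful setup, all on one machine (the individual programs are described below; option names, the port, and the file names here are illustrative, so check each program's --help):

dvswitch -h 0.0.0.0 -p 2000 &                       # the mixer/server: start it first
dvsource-dvgrab -h localhost -p 2000 &              # source 1: a firewire camera
dvsource-file -h localhost -p 2000 credits.dv &     # source 2: a static credits file
dvsink-files -h localhost -p 2000 /space/dv/talk    # sink: save the mixed stream to disk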

The components:

GUI

dvswitch Run this first; everything else connects to it, so if it isn't listening, everything else will error and exit. Each time a source connects, a thumbnail is added to the dvswitch GUI. When sinks are added, it prints a message to stdout.

Here is the GUI with 4 sources connected:

Screenshot

Sources:

dvsource-dvgrab Streams output from the dvgrab command to dvswitch.

From man dvgrab: "Capture DV or MPEG-2 Transport Stream (HDV) video and audio data from FireWire."

If you are doing video things, becoming familiar with dvgrab is not a bad idea.

If you look at dvsource-dvgrab.c you will see it really is that simple: about 300 lines to parse config options, and a handful of lines to do the job:

/* Connect to the mixer, set that as stdout, and run dvgrab. */
int sock = create_connected_socket(mixer_host, mixer_port);
if (dup2(sock, STDOUT_FILENO) < 0)
    return 1;                      /* error handling trimmed */
execvp("dvgrab", dvgrab_argv);

This is what you use to connect a DV device (camcorder or TwinPact100 VGA capture box) to dvswitch. Whatever the device is outputting will be sent to dvswitch, and you will see it in the thumbnail. Note: the user running this process needs read access to the FireWire device, currently /dev/raw1394, which by default is owned by root:root. Change its group to video and add the user to the video group.
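
A sketch of that, assuming a Debian-style system and an example user named alice (on other systems use usermod -a -G video instead of adduser):

# as root
chown root:video /dev/raw1394
chmod g+rw /dev/raw1394     # in case the group bit isn't already set
adduser alice video         # alice must log out and back in for this to take effect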

dvsource-file Streams a DV file to dvswitch. This is often used to fill dead air with a static frame of credits. It is also handy for debugging as it doesn't require any hardware, drivers, or special permissions. It can also take input from stdin, so it should be possible to use some sort of screen capture software (like VNC) to create a DV stream.
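
For example, to keep a file of credits playing in a loop (the --loop option is from memory and the file name is just a placeholder; check dvsource-file --help for your version):

dvsource-file --loop credits.dv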

dvsource-alsa Creates a DV stream by reading audio from a local ALSA device and pairing it with blank video. This is how audio from an ALSA device can be used, for instance a USB audio device that takes line in from a mixer board. It creates a thumbnail on the dvswitch GUI, but the only thing you will use it for is the audio.
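
A sketch, assuming the USB audio device shows up as ALSA card 1 (use arecord -l to find the right card; the argument syntax may differ between versions, so check dvsource-alsa --help):

dvsource-alsa hw:1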

Sinks

dvsink-files Connects to dvswitch, waits for the record button to be pressed, then writes the mixed DV stream to disk, making sure the file name is unique so nothing gets overwritten. You generally want this.

dvsink-command Connects to dvswitch and pipes the mixed stream into a command. The Record/Cut buttons have no effect on it. This is what is used for streaming to an Icecast server with ffmpeg2theora | oggfwd. It can also be used to pipe into a media player so someone can monitor the resulting stream (including audio, which you don't get by looking at the dvswitch GUI).

None of the clients use much CPU, but if you pipe the stream into an encoder, the encoder will need whatever it needs. The dvswitch GUI does require at least a 1 GHz CPU, and will flicker unless you have around 2 GHz. I have run clients on a 300 MHz Pentium II just fine.

Typical configurations

Small - appropriate for a small (around 10 people) user group meeting

base

All of the DVswitch components can be run on the same laptop. It's a small group, so you are going to be close to the presenter, which is good because you are using the camera's mic. The trick is to arrange things so that you get a good picture. It helps if the presenter stays fairly still, for example by sitting. Adding an external mic to the camera might be a good idea; then you can place the camera further away and still get good sound.

Given that not everyone has a TwinPact100, using a camera pointed at the screen might be an acceptable alternative. The software setup doesn't change, it is just different hardware. This is an unproven config, so A) don't assume it will work, and B) if you try it and it does, please let us know.

Medium: added a 2nd camera.

crowdcam

Most presentations have some audience participation. It is really nice to get a shot of both sides of the conversation. You could whip a single camera back and forth, but it's much nicer to use two cameras.

Large: added the sound system:

sound_cam

Good sound is really nice. For presentations, good sound is just a matter of isolating the things we want to hear from things we don't. We are just recording someone talking, which doesn't have quite the requirements of singing or musical instruments. Currently the best way to get good sound is with professional gear; given that the goal is isolation and not fidelity, it can be low-end gear. The mixer is the best UI for quickly adjusting to the speaker moving closer to or farther from the mic, and to people in the audience asking questions. (This is not the same as a PA system, which involves amplifiers and loudspeakers. That is a whole different problem that is outside the scope of this document.)

The resulting mixed sound then needs to be made available to dvswitch, which means it needs to be the audio track of one of the DV streams. One way of doing this is to plug it into a camera's line in. (Not the same as an external mic input, which is what most low-end camcorders will have a jack for. It is not recommended that you put line level into a mic jack, but it has been known to work if you adjust things appropriately.)

The TwinPact100 has line in, but there is a problem: if there is no video input, it stops streaming DV, and so you lose your sound source (really bad). There may be a way to work around this, like telling the presenter not to let his machine's screen saver drop the video signal, or switching the TwinPact100 to an alternate video source, or getting the manufacturer to come out with a new version. So far nothing has turned out to be reliable, but the options have not been exhausted either.

Another solution is to use an ALSA device that accepts line level (it seems hit and miss whether a laptop will have line or mic level inputs; the same caveat as with the camera applies).

sound_alsa

A note about clipping: clipping is the term for when the analog signal is louder than the receiving equipment is meant to handle and the curvy sine wave gets a flat spot when it slams into the limit. The result is crackling that is very unpleasant to listen to (bad), and can make it hard to understand what the person is saying (very bad). If the audio level meter on dvswitch is hitting its max, the A/D hardware is most likely already clipping and you need to drop the level at the sound board. It might be possible to improve things after the fact, but significant data has been lost, so it is best to fix it at the source.

Regardless of what sources you connect to dvswitch, the result is the same: dvswitch will stream the mixed DV to whatever sinks are connected. Remember, these are network connections, so as long as your network has the bandwidth, you can run the clients on other machines.

Saving to disk is the one obvious sink (assuming you have the disk space). What you do with the .dv files (like encoding and uploading to blip.tv) is outside the scope of this documentation.

The other option is piping the DV stream into some other command. Technically, those commands are also outside the scope, but here are some examples:

Local Monitor:

monitor

Pipe the stream into a player that can decode DV so someone can watch and verify the rest of the system is working and no one has fallen asleep. Be careful running this on the same box as DVswitch; the CPU demands are pretty high. Test before the event or you will likely find yourself trying to debug/reconfigure while you are trying to record, which is really bad.
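
One way to do that, assuming ffplay (from the ffmpeg package) is installed; any player that can read raw DV on stdin should work:

dvsink-command -- ffplay -f dv -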

Other uses: drive a projector when the presenter's laptop can't (happened at PyCon09.) Or drive a display and loudspeakers in another room.

Live streaming over the internets!

streaming

This is getting way outside scope... but here is the command I use to send a stream to http://giss.tv:

dvsink-command -- ffmpeg2theora - -f dv -F 25:5 --speedlevel 0 -v 4 -a 0 -c 1 -H 9600 -o - | oggfwd giss.tv 8001 $STREAMPW /CarlFK.ogg

The important part of this is the bandwidth and CPU requirements: the connection from dvswitch to dvsink-command is a single DV stream, just like all the others. ffmpeg2theora is going to need a lot of CPU; exactly how much depends on the stream you create. For anything with text I prefer to keep the full resolution and drop the fps, even if it starts getting choppy. The above command maxes out a 1.7 GHz P4. Appropriate settings very much depend on what CPU you have, how much bandwidth you have available, and how much bandwidth you expect the people on the other side of the internets to have.

By setting up named pipes and feeding a single DV stream to tee, you can run two instances of ffmpeg2theora, one for a high bitrate stream and the other for low (see the sketch below). Currently ffmpeg2theora is single-threaded, so this makes good use of a dual core CPU. You can also connect two instances of dvsink-command to DVswitch, but that will take up another ~30 Mbit/s of bandwidth (which adds up quickly on 100 Mbit; gigabit has no problem).
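
Here is a rough sketch of that named-pipe approach (the pipe paths, mount points, and quality settings are made up for illustration; the ffmpeg2theora and oggfwd options follow the example above):

mkfifo /tmp/dv-hi /tmp/dv-lo

# one encoder per pipe: high quality and low quality
ffmpeg2theora /tmp/dv-hi -f dv -v 6 -a 0 -c 1 -o - | oggfwd giss.tv 8001 $STREAMPW /hi.ogg &
ffmpeg2theora /tmp/dv-lo -f dv -F 25:5 -v 2 -a 0 -c 1 -o - | oggfwd giss.tv 8001 $STREAMPW /lo.ogg &

# a single sink feeds both pipes
dvsink-command -- sh -c 'tee /tmp/dv-hi > /tmp/dv-lo'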

The diagram implies that Icecast is local to the DVswitch LAN. The blue lines represent low-bandwidth streams (each one is whatever you told ffmpeg2theora to create). So unless the venue has lots of upstream bandwidth, the Icecast box should be in the cloud.

See http://icecast.org for everything you need to know about it. The docs and community are great.

Conclusion

You should now know enough to experiment with the system. You should experiment before you try to work with an event you care about. I recommend having an event specifically for recording so that the schedule and expectations are determined by you, the director.