Strategies for large interactive installations
Large interactive installations are a volatile medium. You may work on two projects back to back and have completely different technology scales to deal with. One installation may direct the whole budget towards 100 HD screens and one computer while the other may direct the budget to a single screen and a farm of computers. Being able to devise different strategies for these two situations is critical to success.
Scale is particularly difficult to manage and poorly documented in relation to interactive installations, even though it is something we deal with regularly. I’ve put together a list of six ways that you can change your software architecture strategies to better combat the difficulties of scaling. For this article we’re going to use “small systems” as shorthand for single-machine systems with low output resolutions. Everything beyond that will fall under “large systems” and will include multi-machine systems and high output resolution systems.
1. Settings files
Packaging projects is a practice you get into early on. Packaged projects are easy to move around. “The fewer files the better!”, you think to yourself. It’s an obvious move to package up the project settings inside your main code files or compile them right into your application. This leaves less room for error when moving projects around or when people are wandering around the project folder trying to make tweaks. This is not a bad practice in small systems.
Packaged settings can quickly lead to many pain points in large systems. Nothing is worse than wasting time loading and initializing a large system just because you needed to change a small setting before saving the project files again. This compounds quickly once you add more and more systems. Changing a simple setting on an application that takes 5 minutes to load can quickly eat up an hour or more if there are 20 computers that each need a unique settings value.
I strongly recommend that you decouple settings from applications in large systems. Use XML, JSON, or even just plain old CSV files to hold common settings and attributes related to the project. Settings can include asset paths, screen orientations, rendering resolutions, update timings, on-screen text, show lengths, and so on. Updating a JSON settings file across a large number of computers takes almost no time compared to reloading and re-saving a project on each one.
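As a rough sketch of what this decoupling can look like, the Python snippet below reads a JSON settings file at startup and falls back to defaults for anything the file leaves out. The key names (`asset_path`, `orientation`, and so on) and the `load_settings` helper are made up for illustration; your own project will have its own list.

```python
import json
from pathlib import Path

# Hypothetical defaults. The keys are illustrative only.
DEFAULTS = {
    "asset_path": "assets",
    "orientation": "landscape",
    "render_width": 1920,
    "render_height": 1080,
    "show_length_seconds": 300,
}


def load_settings(path):
    """Return DEFAULTS overridden by whatever the JSON file at `path` defines."""
    settings = dict(DEFAULTS)  # copy so the defaults are never mutated
    settings_file = Path(path)
    if settings_file.exists():
        overrides = json.loads(settings_file.read_text(encoding="utf-8"))
        settings.update(overrides)
    return settings
```

With something like this in place, each computer only needs a small file such as `{"orientation": "portrait"}` next to the application, and changing a value never requires opening or re-saving the project itself.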
2. Arbitrary self
Large systems often have many moving parts that work together to create the whole. There may be a few systems rendering, a few systems dealing with sensor data, and a few systems dealing with displays. It’s common in small systems to make a single all-encompassing project or application that handles all the functionality. This is a positive thing as a large single application is easier to manage than a handful of disparate processes. This same philosophy can be extended to large systems if you add a bit of abstraction.
The settings file becomes even more useful when you start exploring the idea of arbitrary infrastructure. This means that all systems use the same code-base regardless of their intended functionality. This all sounds heady and philosophical but it’s very practical. Take the following situation as an example:
You have 10 computers doing 10 different tasks. You may think about making 10 different project files and managing 10 different code bases where each computer runs its own specific code, but you’d probably pull your hair out by the end of the project. It would be better to make a single code base and project file with 10 sub-modules (or similar) that are loaded when you start the application based on the functionality defined in the settings file. So if the settings file on one computer says “functionality = X”, the project would only load module X when it starts. If the settings file on another computer says “functionality = XYZ”, then the project would load module X, module Y, and module Z.
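One way to picture this is a small registry that maps each letter in the “functionality” setting to a module. The sketch below is only an illustration: the three module functions and the single-letter keys are hypothetical stand-ins for whatever sub-modules your project actually has.

```python
# Hypothetical modules. In a real project these would start rendering,
# sensor handling, display output, and so on.
def start_rendering():
    return "rendering started"


def start_sensors():
    return "sensors started"


def start_display():
    return "display started"


# One key per module, matching the "functionality = XYZ" example above.
MODULES = {"X": start_rendering, "Y": start_sensors, "Z": start_display}


def load_modules(settings):
    """Return the module entry points named by settings["functionality"]."""
    functionality = settings.get("functionality", "")
    unknown = [key for key in functionality if key not in MODULES]
    if unknown:
        # Fail loudly at startup rather than silently running with a module missing.
        raise ValueError(f"unknown module key(s) in settings: {unknown}")
    return [MODULES[key] for key in functionality]
```

Every computer runs the same code and differs only in its settings file, so `load_modules({"functionality": "X"})` gives one machine a single module while `load_modules({"functionality": "XYZ"})` gives another all three.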
3. Always leave headroom
This item is self-explanatory. As screen counts and machine counts increase, you lose quite a bit of system resources to a number of necessary but boring bits of functionality. These include drawing large application windows, network communication, frame synchronization, etc. It can be shocking how much overhead goes to these elements. It can derail a project if you don’t have the foresight, or the experience of having lost processing power to scale before. Far too many developers get trapped when something that ran perfectly on a small test system with one monitor doesn’t perform as intended on a larger rig once more monitors are plugged in.
Some of these elements can be difficult to simulate and quantify reliably, so even experienced developers can encounter difficulties. As I mentioned in my last post, mock ups and test systems become invaluable in getting to know the specifics of what you’re going to be working on and measuring system performance. Each combination of features and system size will vary the amount of overhead. If you’re unable to get a full test rig, it’s best to greatly overestimate how much processing power you’ll lose. I hesitate to say “assume your application can only use 60% of the available system resources”, but it is a safe way to proceed when you don’t have real metrics to work with.
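To make the 60% figure concrete, here is a small sketch that turns it into a per-frame time budget. Treating “system resources” as frame time is my simplifying assumption for the example, and the function names are hypothetical.

```python
def frame_budget_ms(target_fps, usable_fraction=0.6):
    """Milliseconds per frame the application may spend, leaving headroom for overhead."""
    full_frame_ms = 1000.0 / target_fps
    return full_frame_ms * usable_fraction


def fits_budget(measured_ms, target_fps, usable_fraction=0.6):
    """True if a frame time measured on the small test system leaves enough headroom."""
    return measured_ms <= frame_budget_ms(target_fps, usable_fraction)
```

At 60 frames per second a full frame is about 16.7 ms, so the budget comes out at roughly 10 ms. A scene that takes 12 ms per frame on a single-monitor test machine looks healthy there, but by this rule of thumb it is already over budget for the full rig.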
4. Minimize opportunities for inconsistencies
Consistency in this sense is related to system states. The simplest example of an inconsistency that could occur is a missing asset file in the file system. Updating assets on a large system on the fly can be difficult. You may get a new asset on a USB key while working on one computer and another via email while working on a different computer. Coordinating the asset folders to make sure assets are all updated and match across a large number of systems can be a grueling task. Using folder sync services like Dropbox can help, especially since it has LAN sync capabilities.
These opportunities for inconsistencies increase as individual machines in large systems take on more responsibility to process data and create their own assets and data sets. In a situation where there are three computers that need to have an asset generated based on a data set, there are a few options for the workflow:
- Send the same data set to all the computers and have them each render their own version of the asset
- Have one computer render the asset and then send it over the network to the other computers
Neither option is ideal, but I would lean towards the second because data sets are often large and communication errors are common. Rendering on every machine also allows slight variances between the assets to develop if one machine drops a frame or two while rendering. These opportunities for inconsistencies add up slowly. If, on the other hand, the data set is small but the resulting asset is large, the first option may result in less chance of inconsistencies across the systems, as sending a small data set over the network is more reliable than sending a very large asset file.
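Whichever workflow you pick, it helps to be able to check that the asset folders really do match. The sketch below builds a manifest of SHA-256 hashes for a folder and compares two manifests. It is an illustration under my own assumptions: the helper names are hypothetical, and how you move the manifest between machines (shared drive, network message, etc.) is left open.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large assets are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its hash."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(reference, other):
    """Describe how `other` differs from `reference`."""
    shared = reference.keys() & other.keys()
    return {
        "missing": sorted(reference.keys() - other.keys()),
        "extra": sorted(other.keys() - reference.keys()),
        "changed": sorted(name for name in shared if reference[name] != other[name]),
    }
```

Running `build_manifest` on every machine and diffing each result against the manifest from one reference machine turns “I think the folders match” into a list of exactly which files are missing, extra, or different.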
5. Work on one system
Transferring files between the computers in a large system can be slow. Multiple mice and keyboards take up far too much room and are cumbersome. There are some software solutions that can help when controlling a large number of systems, such as Synergy, but they may not work in many situations. I’ve found that trying to work across multiple machines at once will result in a loss of time and efficiency. Steps should be taken to work on a single system where updates can be pushed out from that computer. Version control tools like git work well with certain types of projects, and with others a simple Dropbox setup can remove the burden of having to repeatedly jump between systems. Further workflow optimizations can be made by creating scripts or batch files that launch your project files.
Working on one system really starts to shine when combined with the previous items. If all your computers have mirrored drives and run the same code base, it becomes easy to sit at one master control station where you can make updates and push changes to any piece of the system. You don’t want to get into a position where you’re hopping manically between a large number of systems trying to make changes here and there.
6. Prepare a thorough system failure plan
I’ll keep this one short as it is the most boring by far and is extremely situational. You should have a thorough system failure plan in large scale systems. You may have gone your whole career never spending more than a few minutes thinking about this. Usually it amounts to:
- If main machine fails, use video switcher to switch to backup system
- Use backup system while rebooting main system
This may still work in certain scenarios with a larger system. A multi-machine movie player may not need much in the way of initialization or pre-show preparation. A bit of automated switching and some reboots may be all that is required to solve a computer failure for a simple large system. More complex generative content on systems with distributed processing and rendering may need slightly more intricate failure sequences. This is very situational and becomes more about spending the time to think it through to avoid surprises. Some questions you should be able to answer quickly and in detail include:
- What happens if one computer dies and the others don’t?
- Can a rebooted computer pick up where it left off?
- If the application crashes, can it be re-launched without rebooting the whole computer?
- In a system with master and slave computers, does the master need to be aware of the health of each slave? If a slave fails, does the master need to stop sending commands and initiate the reboot process, or can all the other computers continue running as intended?
- What could be the most common hardware fault points on each computer?
- What could be the most common hardware fault points in the whole system?
- Is there a single point of failure that could stop the whole system from functioning?
You won’t be able to prevent every single possible thing that could ever happen to the system. This becomes more about knowing the likely failure scenarios ahead of time, knowing how to deal with them, and ensuring that you have some automated failure sequence in place.
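As one small piece of such a plan, the sketch below shows a heartbeat monitor that a master machine could use to notice when another computer has gone quiet. This is an assumption-laden illustration, not a full failure sequence: the `HeartbeatMonitor` class and the machine names are hypothetical, and how heartbeats actually arrive (UDP, OSC, a shared file) is up to your system.

```python
import time


class HeartbeatMonitor:
    """Track the last time each machine reported in and flag the ones that went quiet."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_seen = {}

    def beat(self, machine_id):
        """Call whenever a heartbeat message arrives from `machine_id`."""
        self._last_seen[machine_id] = self._clock()

    def stale_machines(self):
        """Machines whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            machine_id
            for machine_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

What the master does with the result of `stale_machines()` is exactly the situational part: it might switch a video feed to a backup, trigger a relaunch, or simply log the event while the rest of the system keeps running.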
This list is not exhaustive by any means, and I hope it serves as a starting point for finding new ways to solve problems of scale. Large scale systems have different needs than small scale systems. Sometimes techniques can be adapted to suit systems of both sizes, but having a set of tools you can pull out for large scale systems is invaluable. You will otherwise run into many difficult moments if you try to use the same techniques at all technology scales. A good companion piece to this is the immersive installation checklist.