Mongrel2
The Language Agnostic Web Server

Mongrel2 Manual
Installing, Deploying, Managing, Hacking

Zed A. Shaw
Guillermo O. “Tordek” Freschi

July 2010

Contents

Preface
1 Introduction
 1.1 Language Agnostic
 1.2 Asynchronous
 1.3 Message Protocol
 1.4 Application Oriented
 1.5 Automated Management
 1.6 Using This Manual
2 Installing
 2.1 Install Dependencies
 2.2 Building Mongrel2
  2.2.1 Using Fossil
  2.2.2 Using the .zip File
 2.3 Building And Installing
 2.4 Testing The Installation
 2.5 Upgrading from trunk
 2.6 Up Next
3 Managing
 3.1 Model-View-Controller
 3.2 Trying m2sh
  3.2.1 What The Hell Just Happened?
 3.3 A Simple Configuration File
 3.4 How A Config Is Structured
  3.4.1 Server
  3.4.2 Host
  3.4.3 Route
  3.4.4 Dir
  3.4.5 Proxy
  3.4.6 Handler
  3.4.7 Others
 3.5 A More Complex Example
 3.6 Routing And Host Patterns
 3.7 Deployment Logs And Commits
 3.8 Control Port
 3.9 Multiple Servers
 3.10 Tweakable Expert Settings
4 Deploying
 4.1 Mongrel2 Deployment Requirements
  4.1.1 Introducing procer
  4.1.2 Installing procer
 4.2 The Plan
 4.3 Step 1: The Deployment Area
 4.4 Step 2: The mongrel2.org Configuration
 4.5 Step 3: Setup procer
  4.5.1 The Python Examples
  4.5.2 Testing The New Setup
  4.5.3 Nice Features of Procer
 4.6 Step 4: Static Content
 4.7 Step 5: Testing And Troubleshooting
 4.8 Further Improvements
 4.9 Deployment Tips
5 Hacking
 5.1 Front-end Goodies
  5.1.1 HTTP
  5.1.2 Proxying
  5.1.3 WebSockets
  5.1.4 JSSocket
  5.1.5 Long Poll
  5.1.6 Streaming
  5.1.7 N:M Responses
  5.1.8 Async Uploads
 5.2 Introduction to ZeroMQ
  5.2.1 A Quick Python ZeroMQ Example
 5.3 Handler ZeroMQ Format
  5.3.1 Socket Types Used
  5.3.2 UUID Addressing
  5.3.3 Numbers Identify Listeners
  5.3.4 Paths Identify Targets
  5.3.5 Request Headers And Body
  5.3.6 Complete Message Examples
  5.3.7 Python Handler API
 5.4 Basic Handler Demo
 5.5 Async File Upload Demo
 5.6 MP3 Streaming Demo
 5.7 Chat Demo
 5.8 Other Language APIs
 5.9 Writing Your Own m2sh
6 Contributing

Preface

This manual will tell you about the most awesome webserver on the planet: Mongrel2. It is written for people with a sense of humor who want to get things done with Mongrel2. That means, if you're an operations professional, software developer, hacker or just curious, it's for you. However, if you're too serious and think “flowery language” (A.K.A. good, entertaining writing) does not belong in your software manuals, then you should just go read the source code and save everyone a huge headache dealing with you.

In case you haven't figured it out, this book will be fun and slightly obnoxious. That's not intended to insult you, but just to keep you interested so that you want to read it.

Typography

Usually the people running the web can be divided into three types of people: Steves, Edsgers, and Knuths.

The Steves think that the entire internet should be a wonderful user experience where all pages are crafted with pixel-perfect fonts with high gloss visuals and coated with the most happy happy joy joy of all possible experiences. To them, design is paramount and actual stability isn't important unless it interferes with design. The Steves of the internet think the Edsgers of internet are destroying the universe with things like “functionality”, “security”, and “stability”. Just like the real Steve Jobs, they would rather everything look fantastic and then use awesome marketing to cover up any technical flaws.

The Edsgers feel that the internet is completely unsafe, and until it is a fully curated and crafted set of academic, peer reviewed papers, it will be a festering pile of dung. To the Edsgers, the world is dangerous and only a truly paranoid attitude toward security and stability will ensure that it becomes safe. They want every single piece of software to reject all reality and be crafted from nothing but pure mathematics, and hate the fact that the Steves want to run around painting the world with useless frivolous colors and words and things that lead to ambiguity and happiness.

The typography in this book, and the entire project, is for the Knuths of the world. I like to think of the Knuths as the practical yet professional types with a light sense of humor. They are the ones who are getting things done while still balancing between great typography and solid bug-free functionality. They aren't zealots, but practical, straight-forward type of people.

That is why this book is written in TeX, and why it uses whatever fonts TeX uses.

Chapter 1
Introduction

Mongrel2 is a web server. HTTP requests come in, HTTP responses go out. Request, response. There is nothing revolutionary or extravagant in what Mongrel2 does with a browser, apart from supporting fancy asynchronous socket protocols. To the browser, Mongrel2 is just this nice web server that has WebSockets and Flash Sockets in it. That's it.

What makes Mongrel2 special is how it satisfies these requests: in a language agnostic and asynchronous way, using a simple messaging protocol to talk to applications rather than just serving files. Mongrel2 is also designed to be incredibly easy to manage automatically as part of your infrastructure.

Other web servers do some of these things, but they either do them in a bastardized way or not all of them at once. Plenty of language specific web servers like Node.js and Jetty have asynchronous operation, but they're not language agnostic. Other web servers will let you talk to any language as a backend, but they insist on using HTTP proxying or FastCGI, which is not friendly to asynchronous operations.

Mongrel2 is the only web server I know of that actively tries to focus on these features as a cohesive whole.


Note 1: TL;DR!

Don't want to read the manual? You can read the GettingStarted page, which is even available in many languages. It's a fast crash course in getting Mongrel2 up and running.


1.1 Language Agnostic

The term “language agnostic” came from people who read about Mongrel2 in the early days, and it means that Mongrel2 does not try to promote any one language over any others. Mongrel2 does not care if you run a “Python shop”, or if you're a die hard PHP fan, or if you hate PHP and love only Ruby on Rails. Mongrel2 only knows about HTTP requests, HTTP responses, async messages, and getting them to your gear to meet those requirements.

Language Agnosticism is the most important feature of Mongrel2, and its entire purpose stems from the desire to reduce the amount of programming language religion in the world. Real people want to get things done, not wanker on which technology is the best or force other people to use their favorite toys. Instead, Mongrel2 works to just be great for every language and make it easy to use what works best for a given problem.

1.2 Asynchronous

Many web servers are “asynchronous” internally, and some force you to know way too much about how they work internally to get anything done. What makes Mongrel2's version of asynchronous messaging different is that it extends outside the Mongrel2 server. The powerful concept here is that even your backends can operate asynchronously, using simple identification of connected clients.

Other servers assume that every request is received from a browser, then sent to a backend, and then the reply is sent directly back to the client and that's it. Mongrel2 assumes that there is a connected client, and it sends requests to backends, but it makes no assumptions about how those backends respond to the clients. All it requires is that the backend application send messages addressed to the client and it will write them on the socket.

Because of this design, Mongrel2 can easily house classic HTTP clients, keep-alive style HTTP clients, chunked encoding responses, JSSockets, and WebSockets using the same code.
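To make the idea concrete, here is a toy model in plain Python. It is not Mongrel2's code: the class name ToyServer, the integer connection IDs, and the method names are all invented for illustration. What it tries to show is that a backend can send any number of messages to any connected client at any time, as long as it knows the client's ID.

```python
class ToyServer:
    """Toy stand-in for the server side: tracks connected clients by ID."""

    def __init__(self):
        self.next_id = 0
        self.sockets = {}  # connection ID -> list of payloads written so far

    def connect(self):
        """Register a new client connection and return its ID."""
        conn_id = self.next_id
        self.next_id += 1
        self.sockets[conn_id] = []
        return conn_id

    def deliver(self, conn_ids, payload):
        """Write a backend message to every addressed client still connected."""
        written = []
        for conn_id in conn_ids:
            if conn_id in self.sockets:
                self.sockets[conn_id].append(payload)
                written.append(conn_id)
        return written


server = ToyServer()
alice, bob = server.connect(), server.connect()

# A backend answers one client, then later pushes to both without a new request.
server.deliver([alice], b"HTTP/1.1 200 OK\r\n\r\nhello")
server.deliver([alice, bob, 99], b"broadcast")  # 99 was never connected
```

Notice that the second deliver call is not a reply to anything. That is the whole trick: the request/response pairing is up to the backend, not the server.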

1.3 Message Protocol

In order to properly do asynchronous messaging in a language agnostic way, Mongrel2 needed a good base protocol that allowed for different messaging styles and worked with many different languages. HTTP proxying already does this, although it's not asynchronous at all. What gives Mongrel2 its special powers is ZeroMQ, a language- and transport-mechanism-agnostic messaging system that does not require a centralized messaging server to operate.

Using ZeroMQ lets Mongrel2 talk to a huge number of languages, operate within any kind of network architecture, and do it with a very simple communication model and API that most programmers can understand.

1.4 Application Oriented

Web servers today are written as if it was still 1995 and all anyone needs to do is serve files, maybe some graphics. Today's web applications are not about serving files; they're about serving application logic and doing it asynchronously. The advent of the bewildering numbers of ways to hack HTTP into an async messaging protocol is proof enough that the pressure is on for web servers to be for applications with highly interactive interfaces.

Mongrel2 can still serve files just fine. In fact, it's got very accurate and easy-to-understand file serving code. However, Mongrel2 will always be about applications. Fast, scalable, awesome, asynchronous or synchronous applications that need to use languages that mere mortals can work with, like PHP. If there's ever a choice, apps win.

1.5 Automated Management

The language agnostic philosophy even extends to the configuration system where you can use any language you need to configure it and manage it, as long as the results are a SQLite3 database Mongrel2 can read and work with to run. There are great tools for managing this database already written in Python, but if you hate Python then you can write anything you want.

This pattern is established with servers like Postfix, Exim, Sendmail, qmail, and others, that convert configuration files to half-assed SQL databases. Mongrel2 effectively adopts a Model-View-Controller design for its configuration system, the same way every web application is designed today. The Model is a SQLite3 database file, which any programming language can access. The Controller is a Mongrel2 process that reads this file and sets itself up accordingly.

The View is a Python script called m2sh that gives you a command line “UI” to configure and set up the Mongrel2 SQLite model. It gives you commands for managing it, crafting configurations, looking at them; the works.

But, most importantly, you can write your own. You don't have to wait for a Mongrel2 developer to craft a configuration file parser for your favorite language, or use some hack job Nagios Perl junk to automate or scan it. It's SQLite3 with a solid, simple schema and well written Python code showing you how it works.

Nothing stops you from automating the hell out of Mongrel2 with that.

1.6 Using This Manual

This manual is intended to be fun to read, so probably the best way to use it is to actually read it.

I know! Revolutionary, right? I mean, who has time to read and learn about something these days. You just want to get in there and get whatever problem you have done, now! No time for words. You just want a straight dump right into your brain so that you are able to solve all your problems instantly and screw all this talking.

You ever ask yourself if this attitude about not wanting to read and learn is possibly the reason you always get stuck in emergencies with no time to read and learn?

Something to think about.

My recommendation is that you go through every page of this manual and do the stuff in it. Even if you think you won't need something, because you're not a programmer, or you're not in operations, you should learn it. Doing so will make the parts you do need clearer and give you better ideas for later.

Chapter 2
Installing

Mongrel2 is designed to build on most modern Unix systems, specifically Linux and Mac OSX. It is written in C (not Ruby) and uses fairly vanilla C and standard libraries, except for one piece that implements the internal coroutines. Other than this, you should be able to compile and install Mongrel2 with nothing more than make all install after you've installed all the dependencies.

Now, if when I said dependencies you started to groan at having to install software to use my software, well my friend, welcome to the future. You said you don't want people reinventing the wheel, right? Great, that means you need to install software for my software to work. It's either that or wait 10 years for me to build everything from scratch like some arrogant jackass.

We good now? Great, let's get started.

2.1 Install Dependencies

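To get everything working you will need the following dependencies, listed here in the order the install listing below uses:

  1. ZeroMQ
  2. Python
  3. SQLite3
  4. distribute and pip
  5. PyZMQ
  6. web.py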

If you install these things in this order, then everything should be good. Typically people run into problems if they do this out of order; for example, trying to install Mongrel2 before they have distribute and pip installed.

Since every system is different, it is difficult to tell you exactly how to install required packages for your OS, but here's how I did it on my computer:


Note 2: Use Virtualenv If You Fear Pip

For those who would rather stick with their system's package management but still need pip and distribute, I suggest you check out virtualenv. What virtualenv does is create a little “chroot install” of just the packages you need using pip or distribute. When you don't want them anymore, you can just delete one directory and you're fine.



Source 1: Installing Dependencies on ArchLinux
# install ZeroMQ 
> wget http://www.zeromq.org/local--files/area:download/zeromq-2.0.8.tar.gz 
> tar -xzvf zeromq-2.0.8.tar.gz 
> cd zeromq-2.0.8/ 
> ./configure 
> make 
> sudo make install 
 
# install python 
> sudo pacman -S python 
 
# install sqlite3 
> sudo pacman -S sqlite3 
 
# install distribute and pip 
> sudo pacman -S python-pip 
> sudo pip install distribute 
 
# install PyZMQ from github 
> sudo pacman -S git 
> git clone http://github.com/zeromq/pyzmq.git 
> cd pyzmq 
> sudo python setup.py install 
 
# web.py 
> sudo pip install web.py

If you run into parts that your OS is missing, which is likely on Debian and SuSE systems, then you'll have to go and figure out how to install it.
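Before moving on, it can save some head scratching to check which of the Python-side dependencies your interpreter can actually see. The following is a small sketch, not part of Mongrel2, and it assumes a modern Python 3 where importlib.util.find_spec exists. The function name missing_modules is made up; the import names are the real ones (PyZMQ installs as zmq, web.py as web).

```python
import importlib.util

# Import names, not package names: PyZMQ installs as "zmq", web.py as "web".
REQUIRED_MODULES = ["sqlite3", "zmq", "web"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the modules from `names` that this Python cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "nothing")
```

This only tells you whether a module can be found, not whether it is a version Mongrel2 is happy with.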

2.2 Building Mongrel2

If everything went well you should be able to grab the Mongrel2 source and try building it. There are two ways you can get the Mongrel2 source code:

  1. Install Fossil SCM and check out the source.
  2. Grab the source .zip release and install it from there.

2.2.1 Using Fossil

Mongrel2 is still fairly new, and new software has bugs. Hell, old software has bugs. If you want to get the latest and greatest, then use fossil to grab the source and stay with updates:

  1. Go to the fossil download page and grab the binary that fits your system, or the source tar.gz.
  2. Follow their install instructions to get it on your system.
  3. If you find that your version of fossil ends up having problems with mongrel2.org let us know since we might need to upgrade.

Once you have fossil you can then get the Mongrel2 source and open it up:


Source 2: Cloning the Mongrel2 Source
> mkdir ~/fossils 
> fossil clone http://mongrel2.org:44445 ~/fossils/mongrel2.fossil 
> mkdir mongrel2 
> cd mongrel2 
> fossil open ~/fossils/mongrel2.fossil

Make sure you do this in order (just like with every set of instructions you follow) or else you'll get errors. For example, if you don't make the ~/fossils directory then you'll get an error saying fossil can't open your file to clone. Well, that's because you didn't make the ~/fossils directory. Just pay attention and don't blame fossil if you can't.

2.2.2 Using the .zip File

If you don't want to install fossil then you will need to get the .zip file from our site and install it that way. Since the code is new, you have no guarantees that this will work at all or that you will get a stable server. We have not made an official release, so using the .zip file is effectively a lame way of avoiding using fossil.

  1. Login as an anonymous user at mongrel2.org.
  2. Go to the latest source tip.
  3. Click on the “ZIP archive” link to download the latest zip file of the source.
  4. Unzip this file and cd into the resulting directory.

2.3 Building And Installing

Once you have the source ready to go you can build it and then install it with one command: make all install

There is no ./configure for Mongrel2 since we avoid too many OS specific differences or shield those away with good feature checks in the code.

The end result of this should be:

  1. Mongrel2 builds and compiles without errors.
  2. All the unit tests run.
  3. The Python libraries that m2sh needs get installed.
  4. The m2sh itself gets installed.
  5. Finally, the mongrel2 binary gets installed.

If any of these stages fail, then you can simply try to fix them and then run: make clean all install which will do everything all over again.

The most common error is if you didn't install pip or distribute and insist on using your own system's package management. Do not report bugs about your own flavor of Linux's stupidity about versions and how it installs Python. If you don't want to use the package system every other Python programmer uses, then you're on your own.

2.4 Testing The Installation

When you are done, you probably want to make sure that it installed correctly. There's a test configuration file in tests/config.sqlite that you can use to try it out:


Source 3: First Test Run
> mkdir run 
> mkdir logs 
> mkdir tmp 
> m2sh dump -db tests/config.sqlite 
> m2sh start -db tests/config.sqlite -host localhost

That's it. Just hit CTRL-c for now and we'll get into playing with this setup later.

2.5 Upgrading from trunk


Source 4: Update your checkout
> cd mongrel2 
> fossil pull http://mongrel2.org 
> fossil update

Don't forget make clean all install.

2.6 Up Next

You now should have a working Mongrel2 system installed and the m2sh configuration interface ready to go. In the rest of this manual we'll be simply learning how to do more with Mongrel2, like making our own configs, writing handlers, and other fun stuff.

Chapter 3
Managing

Mongrel2 is designed to be easy to deploy, and to make automating that deployment easy too. This is why it uses SQLite to store the configuration, with m2sh as an interface for creating it. Doing this lets you access the configuration using any language that works for you, augment it, alter it, migrate it and automate it.

In this chapter, I'm going to show you how to make a basic configuration using m2sh and all the commands that are available. You'll learn how the configuration system is structured so that you know what goes where, but in the end it's just a simple storage mechanism.


Note 3: Apparently SQL Inspires FUD

When I first started talking about Mongrel2, I said I'd store the configuration in SQLite and do a Model-View-Controller kind of design. Immediately people who can't read flipped out and thought this meant they'd be back in “Windows registry hell”, but with SQL as their only way to access it. They thought that they'd be stuck writing configurations with SQL; that SQL couldn't possibly configure a web server.

They were wrong on many levels. Nobody was ever going to make anyone use SQL. That was repeated over and over but, again, people don't read and love spreading FUD. The SQLite config database is nothing like the Windows Registry. No other web server really uses a true hierarchy; they just cram a relational model into a weirdo configuration format. The real goal was to make a web server that was easy to manage from any language, and then give people a nice tool to get their job done without having to ever touch SQL. EVER!

In the end, what we got despite all this fear mongering is a bad ass configuration tool and a design that is simple, elegant, and works fantastically. If you read that Mongrel2 uses SQLite and thought this was weird, well, welcome to the future. Sometimes it's weird out here (even though Postfix has been doing this for a decade or more).


3.1 Model-View-Controller

When you hear Model-View-Controller, you think about web applications. This is a design pattern where you place different concerns into different parts of your system and try not to mix them too much. For an interactive application, if you keep the part that stores data (Model) separated from the logic (Controller) and use another piece to display and interact with the user (View), then it's easier to change the system and adapt it over time to new features.

The power of MVC is simply that these things really are separate orthogonal pieces that get ugly if they're mixed together. There's no math or theory that says why; just lots of experience has told us it's usually a bad idea. When you start mixing them, you find out that it's hard to change for new requirements later, because you've sprinkled logic all over your web pages. Or you can't update your database because there's all these stored procedures that assume the tables are a certain way.

Mongrel2 needed a way to allow you to use various languages and tools to automate its configuration. Letting you automate your deployments is the entire point of the server. The idea was that if we gave you the Controller and the Model, then you can craft any View you wanted, and there's no better Model than a SQL database like SQLite: it's embeddable, easily accessed from C or any language, portable, small, fast enough and full of all the features you need and then some.

What you are doing when you use m2sh to create a configuration for Mongrel2 is working with a View we've given you to create a Model for the Mongrel2 server to work with. That's it, and you can create your own View if you want. It could be automated deployment scripts, a web interface, monitoring scripts, anything you need.

The point is, if you just want to get Mongrel2 up and running, then use m2sh. If you want to do more advanced stuff, then get into the configuration database schema and see what you can do. The structure of the database very closely matches Mongrel2's internal structure, so understanding that means you understand how Mongrel2 works. This is a vast improvement over other web servers like Apache where you've got no idea why one stanza has to go in a particular place, or why information has to be duplicated.

With Mongrel2, it's all right there.
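If you want to poke at the schema yourself, you don't need to know anything about it in advance. Here is a minimal sketch using Python's standard sqlite3 module that lists every table and its columns. It makes no assumptions about what Mongrel2's tables are called; the function name describe_schema is made up for this example.

```python
import sqlite3


def describe_schema(db_path):
    """Return {table_name: [column_name, ...]} for every table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the name.
            columns = conn.execute('PRAGMA table_info("%s")' % table).fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()


if __name__ == "__main__":
    for table, columns in describe_schema("tests/config.sqlite").items():
        print(table, columns)
```

Run it from the source directory against tests/config.sqlite and compare what you see with the Server, Host, and Route structure described later in this chapter.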

3.2 Trying m2sh

To give this configuration system a try you just need to run the test configuration used in the unit tests. Let's try doing a few of the most basic commands with this configuration:


Source 5: Sample m2sh Commands
> m2sh dump -db tests/config.sqlite 
> m2sh servers -db tests/config.sqlite 
> m2sh hosts -db tests/config.sqlite -host localhost 
> m2sh running -db tests/config.sqlite -host localhost 
> m2sh start -db tests/config.sqlite -host localhost

At this point you should have seen some raw dumps of the database, lists of servers and hosts, seen that mongrel2 is not running, and then started it. You can find out about all the commands and get help for them with m2sh help or m2sh help -for command.

You can now try doing some simple starting, stopping and reloading using sudo (make sure you CTRL-c to exit from the previous start command):


Source 6: Starting, Stopping, Reloading
> m2sh start -db tests/config.sqlite -host localhost -sudo 
> tail logs/error.log 
> m2sh reload -db tests/config.sqlite -host localhost 
> tail logs/error.log 
> curl http://localhost:6767/ 
> tail logs/error.log 
> m2sh running -db tests/config.sqlite -host localhost 
> m2sh stop -db tests/config.sqlite -host localhost

Awesome, right? Using just this one little Python management script, you are able to completely manage a Mongrel2 instance without having to hack on a config file at all. But you probably need to know how this is all working anyway.

3.2.1 What The Hell Just Happened?

You now have done nearly everything you can to a configuration, but you might not know exactly what's going on. Here's an explanation of what's going on behind the scenes:

  1. When you did m2sh start with the -sudo option, it actually runs sudo mongrel2 tests/config.sqlite localhost to start the server.
  2. Mongrel2 is now running in the background as a daemon process, just like a regular server. However, what it did was chroot to the current directory and then drop privileges so that they match the owner of that directory (you). Use ps aux to take a look.
  3. With Mongrel2 running, you can look in the logs/error.log file to see what it said. It should be a bunch of debug logging, but check out the messages: nice and detailed.
  4. Next you did a soft reload with m2sh reload and you should notice that your mongrel2 process was able to load the new config without restarting.
  5. However, there's a slight bug that doesn't do the reload until the next request is served. That's what the curl http://localhost:6767/ was for.
  6. Now that you can see this reload work in logs/error.log, you used m2sh running to see if it's running. This command is just reading the config database to find out where the PID file is (run/mongrel2.pid) and then checking if that process is running.
  7. Finally, you tell mongrel2 to stop, and since it dropped privileges to be owned by you, you can do that without having to use sudo.

All of this is happening by reading the tests/config.sqlite file and not reading any configuration files. You can now try building your own configuration that matches this one or some others.
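The “is it running?” check in step 6 is a classic Unix trick, and it's easy to sketch. The code below is not m2sh's actual implementation. It assumes you already know the PID file path (m2sh looks that up in the config database) and that you're on a Unix system. The function name is_running is made up for illustration.

```python
import os


def is_running(pid_file):
    """Return True if pid_file names a process that currently exists."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no PID file, or it does not contain a number

    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

With the test server started from the source directory, is_running("run/mongrel2.pid") should agree with what m2sh running tells you.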

3.3 A Simple Configuration File

To configure a new config database you'll write a Python file that looks a lot like a configuration file. The advantage of using Python is that you can put actual real logic in your configuration file in order to make your configuration smarter. A great application of this is when your configuration has to change depending on the server it is deployed on, but you want to check the configuration file into a revision control system. With Python you just write your configuration like it's code and you're set.

The first thing you need to do is initialize a fresh configuration database to get started by using m2sh init and then you load your configuration into it using m2sh load. For our example, we'll use the example configuration from examples/python/tests/sample_conf.py to make a simple one:


Source 7: Simple Little Config Example
 
from mongrel2.config import *
 
 
main = Server( 
    uuid="f400bf85-4538-4f7a-8908-67e313d515c2", 
    access_log="/logs/access.log", 
    error_log="/logs/error.log", 
    chroot="./", 
    default_host="localhost", 
    name="test", 
    pid_file="/run/mongrel2.pid", 
    port=6767, 
    hosts = [ 
        Host(name="localhost", routes={ 
            r'/tests/': Dir(base='tests/', index_file='index.html', 
                             default_ctype='text/plain') 
        }) 
    ] 
) 
 
commit([main])

If you aren't familiar with Python then this code might look freaky, but it's really simple. We'll get into how it's structured in a second, but to load this file we would just do this:


Source 8: Loading The Simple Config
> m2sh init -db simple.sqlite 
> m2sh load -db simple.sqlite -config examples/python/tests/sample_conf.py 
> m2sh servers -db simple.sqlite 
> m2sh hosts -db simple.sqlite -host localhost 
> m2sh start -db simple.sqlite -host localhost

With this sequence of commands you:

  1. Create a raw fresh config database named simple.sqlite
  2. Load the sample_conf.py into it.
  3. List the servers it has configured.
  4. List the hosts that server has, with what routes it has.
  5. Start this server to try it out.

By now you should be getting the hang of the pattern here, which is to use m2sh and a little Python configuration “script” to generate .sqlite files that Mongrel2 understands.
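The claim that you can put real logic in the config deserves a quick illustration. This is only a sketch: the “prod-” hostname convention, the production values, and the function name server_settings are all hypothetical. It deliberately does not import mongrel2.config, so you can run it anywhere.

```python
import socket


def server_settings(hostname):
    """Pick Server keyword arguments based on the machine's hostname."""
    production = hostname.startswith("prod-")  # hypothetical naming scheme
    return {
        "name": "main" if production else "test",
        "port": 80 if production else 6767,
        "default_host": "mongrel2.org" if production else "localhost",
    }


settings = server_settings(socket.gethostname())
# In a real config file you would pass these on, e.g. Server(uuid=..., **settings)
```

The same file can then be checked into revision control and loaded unchanged on your laptop and on the production box.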

3.4 How A Config Is Structured

The base structure of a Mongrel2 configuration is:

Server
This is the root of a config, and you can have multiples of these in one database, even though each start command only runs one at a time.
Host
Servers have Hosts inside them, which say what DNS hostname Mongrel2 should answer for. You can have multiples of these in each Server.
Route
Hosts have Routes in them, which tells Mongrel2 what to do with URL paths and patterns that match them. Routes then have Dir, Handler or Proxy items in them.
Dir
A Dir serves files out of a directory, complete with 304 and ETag support, default content types, and most of the things you need to serve them.
Proxy
A Proxy takes requests matching the Route they're attached to and sends them to another HTTP server somewhere else. Mongrel2 will then act as a full proxy and also try to keep connections open in keep-alive mode if the browser supports it.
Handler
A Handler is the best part of Mongrel2. It takes HTTP requests, and turns them into nicely packed and processed ZeroMQ messages for your asynchronous handlers.

Each of these nested “objects” then has a set of attributes you can use to configure them, and most of them have reasonable defaults.

3.4.1 Server

The server is all about telling Mongrel2 where to listen on its port, where to chroot, and general server specific deployment gear.

uuid
A UUID is used to make sure that each deployed server is unique in your infrastructure. You could easily use any string that's letters, numbers, or - characters.
chroot
This is the directory that Mongrel2 should chroot to; it then drops privileges to match that directory's owner.
access_log
The access log file relative to the chroot. Usually starts with a `/'. Make sure you don't configure your server so that this and other files are accessible, or make this owned by root.
error_log
The error log file just like access_log.
pid_file
Like the access log, this is where within the chroot directory the PID file is stored.
default_host
The server has a bunch of hosts listed, but it needs to know what the default host is. This is also used as a convenient way to refer to this Server.
port
The port the server should listen on for new connections.
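If you need a fresh value for the uuid attribute, Python's standard uuid module will make one for you. The sketch below also checks the looser rule stated above (letters, numbers, or - characters). Both function names are invented for this example, and the regular expression is my reading of that rule, not something taken from Mongrel2's source.

```python
import re
import uuid


def new_server_uuid():
    """Generate a random UUID string suitable for a Server's uuid attribute."""
    return str(uuid.uuid4())


def looks_like_valid_id(value):
    """Check the manual's looser rule: only letters, numbers, or '-' characters."""
    return re.fullmatch(r"[A-Za-z0-9-]+", value) is not None
```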

3.4.2 Host

A host is matched using a kind of inverse route that matches the ending of Host: headers against a pattern. You'll see how this works when we talk about routes, but for now you just need to know that requests to the Server.port are routed based on these Host configurations the Server contains.

name
The name that you use to talk about this Host in the server configuration.
matching
This is a pattern that's used to match incoming Host headers for routing purposes.
server
If you want to set the server separately you can use this attribute.
maintenance
This will be a setting for the future that will let you have Mongrel2 throw up a maintenance page for this host.
routes
This is a dict (hashmap) of the URL patterns mapped to the targets that should be run.

3.4.3 Route

The Route is the workhorse of the whole system. It uses some very fancy but still simple code in Mongrel2 to translate Host: headers to Hosts and URL paths to Handlers, Dirs, and Proxies.

path
This is the path pattern that matches a route. The pattern uses the Mongrel2 pattern language, which is a reduced version of the Lua pattern matching system.
reversed
Determines if this pattern is reversed, which is useful for matching file extensions, hostnames, and other naming systems where the ending is really the prefix. Usually you don't set this.
host
You can use this attribute to set the host manually.
target
This is the target that should handle the request, either a Dir, Handler or Proxy.

Later on, you'll learn about the pattern matching that's used. It's basically a stripped down version of your normal regular expressions, with a few convenient syntaxes for doing simple string matching. When you configure a route, you write something like /images/(.*.jpg) and the part before the `(' is used as a fast matched prefix, while the part after it is considered a pattern to match. When a request comes in, Mongrel2 quickly finds the longest prefix that matches the URL, and then tests its pattern if there is one. If the pattern matches, the request goes through. If not, 404.
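Here is a rough sketch of that prefix-then-pattern lookup in Python. Treat it as a model of the idea, not Mongrel2's routing code. Python's re module stands in for Mongrel2's Lua-derived pattern language, matching the pattern against the part of the URL after the prefix is my assumption, and the names split_route and find_target are made up.

```python
import re


def split_route(route):
    """Split a route into its literal prefix and optional pattern part."""
    prefix, paren, rest = route.partition("(")
    return prefix, (paren + rest) if paren else None


def find_target(routes, path):
    """Return the target for `path`, or None to signal a 404."""
    best = None
    for route, target in routes.items():
        prefix, pattern = split_route(route)
        if path.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, pattern, target)
    if best is None:
        return None
    prefix, pattern, target = best
    if pattern is None:
        return target
    # Python's `re` stands in for Mongrel2's Lua-derived pattern language here.
    remainder = path[len(prefix):]
    return target if re.fullmatch(pattern, remainder) else None
```

With routes {"/": "root", "/images/(.*.jpg)": "images"}, a request for /images/cat.png picks /images/ as the longest prefix, fails the pattern, and gets a 404. It does not fall back to the shorter / route.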

3.4.4 Dir

A Dir is a simple directory-serving route target that serves files out of a directory. It has caching built-in, handles if-modified-since, ETags, and all the various bizarre HTTP caching mechanisms as RFC-accurate as possible. It also has default content-types and index files.

base
This is the base directory from the chroot that is served. Files should not be served outside of this base directory even in the chroot.
index_file
This is the default index file to use if a request doesn't give one. The Dir also will do redirects if a request for a directory doesn't end in a / slash.
default_ctype
The default Content-Type to use if none matches the MIMEType table.

Currently, we don't offer more parameters for configuration, but eventually you'll be able to tweak more and more of the settings to control how Dirs work.

3.4.5 Proxy

A Proxy is a Ghetto. It is used so that you can use Mongrel2 but not have to throw out your existing infrastructure. Mongrel2 goes to great pains to make sure that it implements a fast and dead accurate proxy system internally, but no matter how good it is, it can't compete with ZeroMQ handlers. The idea with giving Proxy functionality is that you can point Mongrel2 at existing servers, and then slowly carve out pieces that will work as handlers.

addr
The DNS address of the server.
port
The port to connect to.

Requests that match a Proxy route are still parsed by Mongrel2's incredibly accurate HTTP parser, so that your backend servers should not be receiving badly formatted HTTP requests. Responses from a Proxy server, however, are sent unaltered to the browser directly.

3.4.6 Handler

Now we get to the best part: the ZeroMQ Handlers that will receive asynchronous requests from Mongrel2. You need to use the ZeroMQ syntax for configuring them, but this means with one configuration format you can use handlers that are using UDP, TCP, Unix, or PGM transports. Most testing has been done with TCP transports.

send_spec
This is the 0MQ sender specification, something like “tcp://127.0.0.1:9999” will use TCP to connect to a server on 127.0.0.1 at port 9999. The type of socket used is a PUSH socket, so that handlers receive messages in round-robin style.
send_ident
This is an identifier (usually a UUID) that will be used to register the send socket. This makes it so that messages are persisted between crashes.
recv_spec
Same as the send spec, but it's for receiving responses from Handlers. The type of socket used is a SUB socket, so that a cluster of Mongrel2 servers will receive handler responses but only the one with the right recv_ident will process it.
recv_ident
This is another UUID if you want the receive socket to subscribe to its messages. Handlers properly mention the send_ident on all returned messages, so you should either set this to nothing and not subscribe, or set it to the same as send_ident.

The interesting thing about the Handler configuration is you don't have to say where the actual backend handlers live. Did you notice you aren't declaring large clusters of proxies, proxy selection methods, or anything else, other than two 0MQ endpoints and some identifiers? This is because Mongrel2 is binding these sockets and listening. Mongrel2 doesn't actively connect to backends; they connect to Mongrel2. This means, if you want to fire up 10 more handlers, you just start them; no need to restart or reconfigure Mongrel2 to make them active.

3.4.7 Others

There's also Log, MIMEType, and Setting objects/tables you can work with, but we'll get into those later since you don't need to know about them to understand the Mongrel2 structure.

3.5 A More Complex Example

All of this knowledge about the Mongrel2 configuration structure can now be used to take a look at a more complex example. We'll take a look at this example and I'll just say what's going on, and you try to match what I'm saying to the code. Here's the file examples/python/tests/mongrel2_org.py:


Source 9: Mongrel2.org Config Script
 
from mongrel2.config import *
 
main = Server( 
    uuid="2f62bd5-9e59-49cd-993c-3b6013c28f05", 
    access_log="/logs/access.log", 
    error_log="/logs/error.log", 
    chroot="./", 
    pid_file="/run/mongrel2.pid", 
    default_host="mongrel2.org", 
    name="main", 
    port=6767 
) 
 
 
test_directory = Dir(base='tests/', 
                     index_file='index.html', 
                     default_ctype='text/plain') 
 
web_app_proxy = Proxy(addr='127.0.0.1', port=80) 
 
chat_demo_dir = Dir(base='examples/chat/static/', 
                    index_file='index.html', 
                    default_ctype='text/plain') 
 
chat_demo = Handler(send_spec='tcp://127.0.0.1:9999', 
                    send_ident='54c6755b-9628-40a4-9a2d-cc82a816345e', 
                    recv_spec='tcp://127.0.0.1:9998', recv_ident='') 
 
handler_test = Handler(send_spec='tcp://127.0.0.1:9997', 
                       send_ident='34f9ceee-cd52-4b7f-b197-88bf2f0ec378', 
                       recv_spec='tcp://127.0.0.1:9996', recv_ident='') 
 
# the r'' string syntax means to not interpret any \ chars, for regexes 
mongrel2 = Host(name="mongrel2.org", routes={ 
    r'@chat': chat_demo, 
    r'/handlertest': handler_test, 
    r'/chat/': web_app_proxy, 
    r'/': web_app_proxy, 
    r'/tests/': test_directory, 
    r'/testsmulti/(.*.json)': test_directory, 
    r'/chatdemo/': chat_demo_dir, 
    r'/static/': chat_demo_dir, 
    r'/mp3stream': Handler( 
        send_spec='tcp://127.0.0.1:9995', 
        send_ident='53f9f1d1-1116-4751-b6ff-4fbe3e43d142', 
        recv_spec='tcp://127.0.0.1:9994', recv_ident='') 
}) 
 
main.hosts.add(mongrel2) 
 
settings = {"zeromq.threads": 1} 
 
commit([main], settings=settings)

If you haven't guessed yet, this configuration is what's used on http://mongrel2.org to configure the main test system. In it we've got the following things to check out:

  1. Our basic server, with a default host of mongrel2.org.
  2. The route targets are separated out into their own variables, unlike the sample_conf.py file where they're just tossed into one big structure.
  3. First target is a Dir that serves up files out of the tests directory and uses index.html as its default file.
  4. Next we setup a Proxy pointing at the main website's server for testing the proxy.
  5. Then there's a Dir target for the http://mongrel2.org:6767/chatdemo/ that we'll look at later.
  6. And you have the Handler for the same chat demo that does the actual logic of a chat system.
  7. After that's a little Handler for testing out doing HTTP requests to a handler. Notice how even though the chat demo and this handler use different protocols (chat demo is using JSSockets), you don't have to tell Mongrel2 that? It figures it out based on how they're being used rather than by configurations.
  8. With all those handler targets, we can now make the mongrel2 Host with all the routes assigned once, nice and clean. However, look how I was lazy and just tossed the mp3stream demo right into the routes dict? You can totally do this and m2sh will figure it out. Remember also that you can use the r'blah' string format to not have to double up on your \ chars in the patterns.
  9. We then assign this mongrel2 variable as the hosts for the main server.
  10. There is also a settings feature, which is just a dict of global settings you can tweak. In this case, we're setting the number of threads that 0MQ uses for its operations.
  11. Finally, we commit the whole thing to the database by passing in the servers to save and the settings to use.

And that, my friends, is the most complex configuration we have so far.

3.6 Routing And Host Patterns

The pattern code was taken from Lua and is some of the simplest code for doing fast pattern matches. It is very much like regular expressions, except it removes a lot of features you don't need for routes. Also, unlike regular expressions, URL patterns always match from the start. Mongrel2 uses them by breaking routes up into a prefix and pattern part. It then uses routes to find the longest matching prefix and then tests the pattern. If the pattern matches, then the route works. If the route doesn't have a pattern, then it's assumed to match, and you're done.

The only caveat is you have to wrap your pattern parts in parentheses, but these don't mean anything other than to delimit where a pattern starts. So instead of /images/.*.jpg, write /images/(.*.jpg) for it to work.

Here's the list of characters you can use in your patterns:

Using the uppercase version of an escaped character makes it work the opposite way (e.g., \A matches any character that isn't a letter). The backslash can be used to escape the following character, disabling its special abilities (e.g., \\ will match a backslash).

Anything that's not listed here is matched literally. Remember also that when you write the Python code for your scripts you can use the r'blah' string syntax to avoid having to double up on your \ characters in the string.


Note 4: Sorry, Unicodians, It's All ASCII

Yep, I get it. You think that everyone should use UTF-8 or some Unicode encoding for everything. You despise the dominance of the `A' in ASCII and hate that you can't put your spoken language right in a URL.

Well, I hate to say it, but tough. Protocols are hard enough without having to worry about the bewildering mess that is Unicode. When you sit down to write a network protocol, the last thing you need is a format that's inconsistent, has multiple interpretations, can't be properly capitalized or lowercased, and requires extra translation steps for every operation. With ASCII, every computer just knows what it is, and it's the fastest for creating wire protocol formats.

This is why, on the Internet, you have to do things to URLs to make them ASCII, like encoding them with % signs. It's in the standard, and it's the smart thing to do. I don't want to have to know the difference between the various accents in your spoken language to route a URL around. I just want to deal with a fixed set of characters and be done with it. Don't blame me or Mongrel2 for this, it's just the way the standard is and the way to get a server that is stable and works.

Protocols work better when there's less politics in their design. This means you can't put Unicode into your URL patterns. I mean, you can try; but the behavior is completely undefined.


Here are some example routes you can try to get a feel for the system:

That should give you the idea of how you can use them. Notice also that I'm using the Python r"blah" string syntax, which is interchangeable with the r'blah' syntax, so I don't have to double escape everything.

3.7 Deployment Logs And Commits

A very nice feature for people doing operations work is that m2sh keeps track of all the commands you run on it while you work, and lets you add little commit logs to the log for documentation later. These commit logs are then maintained even across m2sh init and m2sh load commands so you can see what's going on. They track who did something, what server they did it on, what time they did it and what they did.

To see the logs for your own tests, just do m2sh log -db simple.sqlite and then, if you want to add a commit log message, you use the m2sh commit command. Here's an example from mongrel2.org:


Source 10: Example Commit Log
> m2sh log -db config.sqlite 
[2010-07-18T04:14:53, mongrel2@zedshaw, init_command] /usr/bin/m2sh init -db config.sqlite 
[2010-07-18T04:15:06, mongrel2@zedshaw, load_command] /usr/bin/m2sh load -db config.sqlite -config examples/python/tests/mongrel2_org.py 
[2010-07-18T04:22:06, mongrel2@zedshaw, load_command] /usr/bin/m2sh load -db config.sqlite -config examples/python/tests/mongrel2_org.py 
[2010-07-18T04:23:32, mongrel2@zedshaw, load_command] /usr/bin/m2sh load -db config.sqlite -config examples/python/tests/mongrel2_org.py 
[2010-07-18T04:26:16, mongrel2@zedshaw, upgrade] Latest code for Mongrel2. 
[2010-07-18T18:05:59, mongrel2@zedshaw, load_command] /usr/bin/m2sh load -db config.sqlite -config examples/python/tests/mongrel2_org.py 
[2010-07-18T20:09:01, mongrel2@zedshaw, init_command] /usr/bin/m2sh config -db config.sqlite -config examples/python/tests/mongrel2_org.py 
[2010-07-18T20:09:02, mongrel2@zedshaw, load_command] /usr/bin/m2sh config -db config.sqlite -config examples/python/tests/mongrel2_org.py 
> m2sh commit -db config.sqlite -what mongrel2.org -why "Testing things out."

The motivation for this feature is the trend that operations store server configurations in revision control systems like git or etckeeper. This works great for holding the configuration files, but it doesn't tell you what happened on each server. In many cases, the configuration files also need to be reworked or altered for each deployment. With the m2sh log and commit system, you can augment your revision control with deployment action tracking.
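
If you want to feed these logs into your own tooling, a small parser goes a long way. The following is a rough Python sketch that assumes the line layout is exactly what the sample output shows: a bracketed timestamp, who@where and action, followed by free text. The field names (when, who, action, message) and the parse_log helper are my own labels, not anything m2sh defines.

```python
import re

# ASSUMPTION: layout inferred only from the sample m2sh log output.
LOG_LINE = re.compile(
    r"^\[(?P<when>[^,]+), (?P<who>[^,]+), (?P<action>[^\]]+)\] (?P<message>.*)$"
)

def parse_log(text):
    """Turn m2sh log output into a list of dicts, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries

sample = """\
> m2sh log -db config.sqlite
[2010-07-18T04:14:53, mongrel2@zedshaw, init_command] /usr/bin/m2sh init -db config.sqlite
[2010-07-18T04:26:16, mongrel2@zedshaw, upgrade] Latest code for Mongrel2.
"""

entries = parse_log(sample)
# Collect the commit-style entries, i.e. the ones that are not m2sh commands.
commits = [e for e in entries if not e["action"].endswith("_command")]
```

From here it's a short step to lining up these entries against your revision control history.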

Later versions of Mongrel2 will keep small amounts of statistics which will link these actions to changes in Mongrel2 behavior like frequent crashing, failures, slowness, or other problems.

Basically, there's nowhere to hide. Mongrel2 will help operations figure out who needs to get fired the next time Twitter goes down.

3.8 Control Port

Just before the release of 1.0, we added a feature called the “Control Port”, which lets you connect to a running Mongrel2 server over a unix (domain) socket and give it control commands. These commands let you get the status of running tasks, list the currently connected sockets and how long they've been connected, get the server's current time, and kill a connection. Using this control port, you can then implement any monitoring and timeout policies you want, and provide better status.

By default, the control port is in your chroot at run/control, but you can set the control_port setting to change this. You can actually change it to any valid ZeroMQ spec you want, although you're advised to use IPC for security.

Once Mongrel2 starts, you can then use m2sh to connect to Mongrel2 and control it using the simple command language. Currently, what you get back is very raw, but it will improve as we work on the control port and what it does.

The commands you can issue are:

status tasks
Dumps a JSON formatted dict (object) of all the currently running tasks and what they're doing. Think of it like an internal ps command.
status net
Dumps a JSON dict that matches connection IDs (same ones your handlers get) to the seconds since their last ping. In the case of an HTTP connection this is how long they've been connected. In the case of a JSON socket this is the last time a ping message was received.
time
Prints the unix time the server thinks it's using. Useful for synching.
kill ID
Does a forced close on the socket that is at this ID from the status net command. This is a rather violent way to kill a connection so don't do it that often, but if you're overloaded then this is where to go.
control stop
Shuts down the control port permanently in case you want to keep it from being accessed for some reason.

You then use the control port by running m2sh:

  m2sh control -db config.sqlite -name test
  CONNECTING...
  > status net
  {"total": 0}
  > time
  {"time": 1282980306}

As we work on this feature more, we'll have a nicer interface to it. If you wanted to connect on your own, here is a simple Python script showing how to do it:


Source 11: Python Control Port Example
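import zmq 
 
CTX = zmq.Context() 
 
# Default control port, relative to the directory you run this from. 
addr = "ipc://run/control" 
 
# A REQ socket sends one command, then waits for its one reply. 
ctl = CTX.socket(zmq.REQ) 
 
print("CONNECTING") 
ctl.connect(addr) 
 
try: 
    while True: 
        cmd = input("> ") 
        ctl.send_string(cmd) 
        print(ctl.recv_string()) 
except (EOFError, KeyboardInterrupt): 
    # Ctrl-D or Ctrl-C ends the session so the socket gets closed. 
    pass 
finally: 
    # linger=0 drops any unsent command instead of waiting on exit. 
    ctl.close(linger=0)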

You obviously don't need to do this, but should you want to do something special like a management interface, this is your start.
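
As one example of a timeout policy, here is a hedged Python sketch that turns a status net reply into kill commands for connections that have been quiet too long. The reply shape is an assumption: the sample session only shows {"total": 0}, so I'm guessing that each connection appears as an ID key mapped to seconds since its last ping, next to the "total" key. The helper name idle_kill_commands is made up.

```python
import json

def idle_kill_commands(status_net_reply, max_idle_seconds):
    """Build 'kill ID' commands for connections idle past the limit.

    ASSUMPTION: the reply is a JSON dict of connection ID -> seconds since
    the last ping, plus a "total" count that is not a connection.
    """
    status = json.loads(status_net_reply)
    commands = []
    for conn_id, idle in sorted(status.items()):
        if conn_id == "total":
            continue
        if idle > max_idle_seconds:
            commands.append("kill %s" % conn_id)
    return commands

# A made-up reply with three connections, two of them idle for a while.
reply = '{"total": 3, "7": 2, "12": 95, "31": 400}'
commands = idle_kill_commands(reply, max_idle_seconds=60)
```

Each resulting string could then be sent over the control port the same way the script above sends what you type. Remember that kill is violent, so pick the limit conservatively.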

3.9 Multiple Servers

A Mongrel2 process itself does not have any support for running multiple servers; instead, it takes two simple parameters: a sqlite config database and a server uuid that names the server to be launched. This is done to keep the mongrel2 code simple and workable.

However.

Mongrel2's m2sh does support launching multiple servers from a single configuration database. By passing -every to many m2sh commands, you are able to perform actions on all configured servers at once. You can also perform actions on single servers by specifying their uuid, name or host. If any parameter given is ambiguous (for example, if you search with -host localhost and your config contains two servers which attempt to bind to localhost), m2sh will list the matching servers and ask you to clarify your selection.

For example:

  > m2sh start -db config.sqlite -every
  Launching server localhost 9f0cbd7d-aeff-4195-921e-2ce1c25512d3 on port 6768
  ...
  Launching server localhost 3d815ade-9081-4c36-94dc-77a9b060b021 on port 6767
  ...
  
  > m2sh start -db config.sqlite -host localhost
  Not sure which server to run, what I found:
  NAME HOST UUID
  --------------
  localhost localhost 3d815ade-9081-4c36-94dc-77a9b060b021
  localhost localhost 9f0cbd7d-aeff-4195-921e-2ce1c25512d3
  * Use -every to run them all.
  
  > m2sh start -db config.sqlite -uuid 3d815ade-9081-4c36-94dc-77a9b060b021
  Launching server localhost 3d815ade-9081-4c36-94dc-77a9b060b021 on port 6767
  ...
  
  > m2sh running -db config.sqlite -every
  Found server localhost 3d815ade-9081-4c36-94dc-77a9b060b021 RUNNING at PID 28525
  PID file run/mongrel2.pid not found for server localhost 9f0cbd7d-aeff-4195-921e-2ce1c25512d3
  
  > m2sh stop -db config.sqlite -every
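
The selection rule in that session can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not how m2sh is written: servers are plain dicts here, and select_servers is a hypothetical helper.

```python
def select_servers(servers, every=False, **criteria):
    """Return the servers matching all given criteria (uuid, name, host)."""
    if every:
        return list(servers)
    wanted = {key: value for key, value in criteria.items() if value is not None}
    if not wanted:
        return []  # nothing to search on
    return [
        server for server in servers
        if all(server.get(key) == value for key, value in wanted.items())
    ]

# The two servers from the example session, as plain dicts.
servers = [
    {"name": "localhost", "host": "localhost",
     "uuid": "3d815ade-9081-4c36-94dc-77a9b060b021"},
    {"name": "localhost", "host": "localhost",
     "uuid": "9f0cbd7d-aeff-4195-921e-2ce1c25512d3"},
]

by_host = select_servers(servers, host="localhost")
by_uuid = select_servers(servers, uuid="3d815ade-9081-4c36-94dc-77a9b060b021")
```

Here by_host has two entries, which is the ambiguous case where m2sh lists what it found and asks you to be more specific, while by_uuid has exactly one.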

3.10 Tweakable Expert Settings

Many of Mongrel2's internal settings are configurable using the settings system. Some of these are dangerous to mess with, so make sure you test any changes before you try to run them. Setting them to 0 or negative numbers isn't checked, so if you make a setting and things go crazy, you need to not make that setting. All of these have good defaults so you can leave them alone unless you need to change them.

To configure your settings, you just pass them to the commit command when you setup your configuration. Like this:


Source 12: Changing Settings
settings = {"zeromq.threads": 1, "limits.url_path": 1024} 
 
commit([main], settings=settings)

Mongrel2 will read these on the fly and write INFO log messages telling you what the settings are so you can debug them if they cause problems. The available settings are:

control_port=ipc://run/control
This is where Mongrel2 will listen with 0MQ for control messages. You should use ipc:// for the spec so that only a local user with file access can get at it.
limits.buffer_size=2 * 1024
Internal IO buffers, used for things like proxying and handling requests. This is a very conservative setting, so if you get HTTP headers greater than this, you'll want to increase this setting. You'll also want to shoot whoever is sending you those requests, because the average is 400-600 bytes.
limits.connection_stack_size=32 * 1024
Size of the stack used for connection coroutines. If you're trying to cram a ton of connections into very little RAM, see how low this can go.
limits.content_length=20 * 1024
Maximum allowed content length on submitted requests. This is, right now, a hard limit so requests that go over it are rejected. Later versions of Mongrel2 will use an upload mechanism that will allow any size upload.
limits.dir_max_path=256
Max path length you can set for Dir handlers.
limits.dir_send_buffer=16 * 1024
Maximum buffer used for file sending when we need to use one.
limits.fdtask_stack=100 * 1024
Stack frame size for the main IO reactor task. There's only one, so set it high if you can, but it could possibly go lower.
limits.handler_stack=100 * 1024
The stack frame size for any Handler tasks. You probably want this high, since there's not many of these, but adjust and see what your system can handle.
limits.handler_targets=128
The maximum number of connection IDs a message from a Handler may target. It's not smart to set this really high.
limits.header_count=128 * 10
Maximum number of allowed headers from a client connection.
limits.host_name=256
Maximum hostname for Host specifiers and other DNS related settings.
limits.mime_ext_len=128
Maximum length of MIME type extensions.
limits.url_path=256
Max URL paths. Does not include query string, just path.
superpoll.hot_dividend=4
Ratio of the total (like 1/4th, 1/8th) that should be in the hot selection. Set this higher if you have lots of idle connections; set it lower if you have more active connections.
superpoll.max_fd=10 * 1024
Maximum possible open files. Do not set this above 64 * 1024, and expect it to take a bit while Mongrel2 sets up constant structures.
upload.temp_store=None
This is not set by default. If you want large requests to reach your handlers, then set this to a path in a directory they can access, and make sure they can handle it. Read about it in the Hacking section under Uploads. The path has to end in XXXXXX chars to work (read man mkstemp).
zeromq.threads=1
Number of 0MQ IO threads to run. Careful, we've experienced thread bugs in 0MQ sometimes with high numbers of these.
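
Since zero and negative values aren't checked for you, it can be worth checking them yourself before calling commit. Below is a small Python sketch of such a check. The check_settings helper is hypothetical and not part of m2sh; the only rules it encodes are the two stated in this section (numbers should be positive, and superpoll.max_fd should not go above 64 * 1024).

```python
MAX_FD_CEILING = 64 * 1024  # upper bound given for superpoll.max_fd

def check_settings(settings):
    """Return a list of problems found in a settings dict (empty if none)."""
    problems = []
    for name, value in settings.items():
        # Only integer settings are checked; strings like control_port pass.
        if isinstance(value, bool) or not isinstance(value, int):
            continue
        if value <= 0:
            problems.append("%s must be positive, got %d" % (name, value))
        elif name == "superpoll.max_fd" and value > MAX_FD_CEILING:
            problems.append("%s is above %d" % (name, MAX_FD_CEILING))
    return problems

settings = {"zeromq.threads": 1, "limits.url_path": 1024}
problems = check_settings(settings)
```

In a config script you might refuse to call commit while problems is non-empty.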

Chapter 4
Deploying

I am now going to try to get you to setup a small, tiny, little version of a good deployment that matches the configuration of the site at http://mongrel2.org, with all the examples running. This configuration will give you all the tools you need to make automated and managed deployments, but it is using small scale tools. The idea is that you learn what is involved in a nice, easy-to-manage setup, using simple things first, then you can extrapolate that out into your own setup or something better.

4.1 Mongrel2 Deployment Requirements

It may seem obvious, but I'll go over the things you need in order to continue on in this section:

Mongrel2
I know, hard to believe, but you actually need to have Mongrel2 installed.
m2sh
Again, not sure why, but some folks think they don't need this. Unless you've written your own, you need m2sh.
Python
Obviously, if you have m2sh, then you have Python, but some systems (like Debian) don't install all of Python. Make sure your Python setup is good.
root
You'll need root access on your box. Either through sudo or some other means.
Basic Python coding
Right now, you should be able to do some basic Python.

That will get you going at first and, as we go, we'll do various other setups to get our application working.


Note 5: Learning Python

Why should you learn programming? The trend is that if you are a system administrator who can't code, you are on your way out. Eventually, you'll be in charge of automating systems, not manually managing them, and if you don't believe me then what do you think all those managed service companies are doing? Alright, so you need to learn to code, but most of the books suck for really learning if you know nothing.

This is why I started my own book: Learn Python The Hard Way, for people who know nothing about programming but need or want to learn. It teaches Python, but it mostly teaches all the things programmers actually learn before they learn programming. When you're done with my book, you'll have your “programming brown belt”. That means you can then move onto one of many other free online books and really learn programming, and have a higher chance of actually learning it.

If you can't code Python then you can probably muddle through this and you may learn something, but learning Python will be important later.

But don't read “Dive Into Python”. It is a horrible introduction.


4.1.1 Introducing procer

When I started working on this little manual, I wanted to get you into setting up a well-managed and automated deployment system. The m2sh program does much of the automation you need, but Mongrel2 also has to talk to quite a few separate little pieces that run as separate processes. Trying to juggle all these processes without a tool to help is a nightmare. You end up writing init scripts and merging them into your boot process and all sorts of crazy antics just so you can run a stupid hello world demo.

What I needed was a “user space process manager”. These are programs that run other programs, but, more importantly, try to keep those other programs running without much human intervention. When you need to deploy a ton of processes that all have to be running, these USPMs are fantastic. They usually read some startup profile describing what needs to start and what each thing depends on, and then they kick everything into gear and watch it. If any of the processes crash, they try to restart them. Very simple.

There's just one catch: all of them suck. There's daemontools, which barely builds (if at all) and then assumes that daemons don't fork. Stupid. There's minit, which bafflingly required dietlibc to even compile and assumed it was going to be the one true init (not user space at all). There's cinit, which got through a compile, then barfed on its documentation, and the end result is some huge number of weird shell scripts to make it work, and, again, it wants to be the one true init. Finally, runit is some of the worst C code I've seen in years and has the same weird design as daemontools.

After trying every single one, I just gave up. They either didn't build, were too complex, expected to be the one true init, were poorly documented, or weren't maintained, and they definitely weren't going to work for this manual. My only choice was to shave a yak and write my own.

The end result is procer, which lives in examples/procer and does most of what you need in a USPM. It works a lot like daemontools or minit, with these differences:

  1. It is much simpler, with only a single command to start all your stuff and keep it running.
  2. It will build anywhere Mongrel2 builds, because it reuses the libm2.a library from the Mongrel2 project.
  3. It doesn't want to be the one true init, or even expect to be running constantly. You can start it and stop it and it will only run what's not already running.
  4. It assumes that programs will always daemonize and create a PID file. This turns out to be way easier to manage than what daemontools does, so I'm sort of baffled why daemontools is how it is.
  5. It has dependency management so that you can have processes start only after others have finished.
  6. It still uses simple files to configure itself that are in separate directories.
  7. It can be run as root and, like Mongrel2, it will drop privileges to the owner of the profile directory before it runs the command. This is incredibly useful because it lets you setup scripts that run as other users without much configuration or fuss.
  8. It is dinky, tiny and well written so you can understand it, even though it's written in C.
  9. Best of all, I can use it in this book and you won't go insane trying to install it or use it like the others.

Of course, if you have something else you like then, please, use it. Anything that automates process management will be your friend. In this manual, to keep things simple and easily understood, I'll be using procer to tell you how to setup everything.
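
The "only run what's not already running" idea rests on PID files, and the check itself is simple. Here is a minimal Python sketch of it for POSIX systems. It is not procer's code (procer is written in C), and the function names read_pid and is_running are made up for illustration.

```python
import os

def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it can't be read."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None

def is_running(pid_file):
    """True if the process named in pid_file exists (POSIX only)."""
    pid = read_pid(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return False     # stale PID file, the process is gone
    except PermissionError:
        return True      # process exists but belongs to another user
    return True
```

A process manager built on this would start a profile's command only when is_running returns False for that profile's PID file, then check again later to catch crashes.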


Note 6: Alternatives to procer

I wrote procer mostly for this book, but I also use it for my Mongrel2 deployments. It works for me but you can try other solutions. By default, Mongrel2 will work with either daemontools/runit style, or init.d style launchers. If Mongrel2 runs as a regular user, it assumes that you want runit style (don't fork, write to stdout/stderr). If you run as root, it assumes you want init.d style like what procer uses (fork, drop priv, chroot, etc.).

You should check out proclaunch as another alternative that is similar to procer, and inspired by procer, but written in Perl with a few more features.

Either way, Mongrel2 is practical, and does generally the right thing with today's tools. Want to use daemontools? Fine, just run it as mongrel2 config.sqlite server_uuid and it'll work right. Want to put it in init.d or use procer or similar? Fine, run it as root.


4.1.2 Installing procer

Installing procer is very easy. It's a single little binary and it lives in examples/procer in the Mongrel2 source. Here's how you'd install it totally from scratch as if you hadn't even built Mongrel2 yet:


Source 13: Install procer
> cd projects/mongrel2 
> make clean all install 
> cd examples/procer 
> make clean all install

That's the entire install process, and now procer is in /usr/local/bin so you can use it. In the rest of this chapter you'll learn how to use procer by just setting up the Mongrel2 demo completely and messing around with it.

4.2 The Plan

We need to plan this deployment to make sure we get the end result correct:

  1. Create a deployment area where everything will live.
  2. Create a config.sqlite that will work with the demos in examples.
  3. Setup procer to run Mongrel2 and the three demo Python scripts for chat, handlertest, and mp3stream, and have it run the fake backend web.py project so we have something to proxy to.
  4. Get all the static file content working.
  5. Test out that procer is keeping things running and play with taking things down and up and using m2sh to work with the deployment.

Once you have this setup working, you can then start to make your own deployments and tweak things as you need for your own applications. Remember that the goal is to get you to automate everything as much as possible, so if you can go further than this, then do it.

4.3 Step 1: The Deployment Area

We'll need a place to put all this stuff and run it so that Mongrel2 can chroot there, procer knows where its profiles are, and it's all nice and clean. For these instructions, we're just going to make some directories in your home directory, but feel free to change this up later if you find a better way.


Source 14: Make Deployment Directories
# go home first 
> cd ~/ 
 
# create the deployment dir 
> mkdir deployment 
> cd deployment/ 
 
# fill it with the directories we need 
deployment > mkdir run tmp logs static profiles 
 
# create the procer profile dirs for each thing 
deployment > cd profiles/ 
deployment/profiles > mkdir chat mp3stream handlertest web mongrel2 
deployment/profiles > cd .. 
 
# setup the mongrel2 database initially 
deployment > m2sh init -db config.sqlite 
 
# copy the mongrel2_org.py sample from the source to here 
deployment > cp ~/mongrel2/examples/python/tests/mongrel2_org.py config.py 
 
# see our end results 
deployment > ls 
config.py  config.sqlite  logs  profiles  run  static tmp

Hopefully, you're starting to see how you could easily automate this so that you don't have to do this all the time. I'm just showing you how to “make the sausage” so that you know where everything goes. Future versions of m2sh will most likely create deployment directories like this automatically.

What we've done here is the following:

  1. Setup a ~/deployment directory we'll put everything in.
  2. Created run, tmp, logs, static, and profiles that Mongrel2 and procer need to run.
  3. In profiles we started dirs for chat, mp3stream, handlertest, web and mongrel2, that procer will read files out of to get all our gear up and running.
  4. Initialized the config.sqlite file we'll be filling in with our modified config.py.
  5. Copied the mongrel2_org.py example file over to our deployment so we can modify it.
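
As a taste of automating this, here is a short Python sketch that creates the same directory layout. It only makes the directories; running m2sh init and copying the config file are left out. The make_deployment function name is my own, and the directory names are the ones used in the steps above.

```python
from pathlib import Path

TOP_LEVEL = ["run", "tmp", "logs", "static", "profiles"]
PROFILES = ["chat", "mp3stream", "handlertest", "web", "mongrel2"]

def make_deployment(root):
    """Create the deployment directories under root and return its Path."""
    root = Path(root)
    for name in TOP_LEVEL:
        # parents=True also creates root itself on the first pass.
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in PROFILES:
        (root / "profiles" / name).mkdir(exist_ok=True)
    return root
```

Calling make_deployment(Path.home() / "deployment") gives you the layout from Source 14, and because of exist_ok=True it is safe to run more than once.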

4.4 Step 2: The mongrel2.org Configuration

Now we're ready to get the configuration working. Here's the thing, though: you should try to alter the configuration yourself. I've already given you the file and you are going to have to make the changes to meet the requirements for this deployment directory. Here's what you have to change in config.py to make everything work right:

  1. Get rid of the test_directory handler, since we won't need it, and any routes that mention it.
  2. Change the base of chat_demo_dir to 'static/chatdemo/', which we'll setup at the end.
  3. Modify the server chroot so that it's /home/YOU/deployment/.
  4. Use the m2sh uuid command to make some new UUIDs for all the existing ones. This is optional, but probably a good idea to get in the habit now.
  5. Change the port for web_app_proxy so it points to 8080 instead of 80.
  6. Finally, change any mention of “mongrel2.org” into “localhost” so that you can run it locally.

Once you have that all edited, you should be able to run m2sh load -db config.sqlite -config config.py and it'll just load it up. Try using m2sh servers and m2sh hosts to take a peek.

To test it out at this stage, you can just run the config.sqlite you loaded with these commands:


Source 15: Testing The Initial Configuration
> m2sh start -db config.sqlite -host localhost 
^C 
> m2sh start -db config.sqlite -host localhost -sudo 
> less logs/error.log 
> m2sh stop -db config.sqlite -host localhost -murder

That's enough to make sure it runs, but you've got nothing running, so it mostly won't work at all. Just start up and then kill it right after.

4.5 Step 3: Setup procer

Now we want to make procer start everything for us and keep it running. How procer works is you put a few special files into a directory in profiles. This directory (say chat) is the profile for that app. When you start procer, you point it at the main profiles directory and it tries to run it. It's dead simple and very easy to automate, so we'll do it by hand and then you can do some automation later.

Let's first setup a basic config that gets our skeleton profiles and make sure procer can run everything:


Source 16: Skeleton procer Setup
deployment > cd profiles/ 
deployment/profiles > ls 
chat  handlertest  mongrel2  mp3stream  web 
 
# make all the restart settings 
deployment/profiles > for i in *; do touch $i/restart; done 
 
# make all the empty dependencies 
deployment/profiles > for i in *; do touch $i/depends; done 
 
# setup the pid_files to some sort of default 
deployment/profiles > for i in *; do echo $PWD/$i/$i.pid > $i/pid_file; done 
deployment/profiles > cat chat/pid_file 
 
# get the run script setup to do nothing 
deployment/profiles > for i in *; do echo '#!/bin/sh' > $i/run; done 
deployment/profiles > for i in *; do chmod u+x $i/run; done 
 
# check out what we did 
deployment/profiles > ls -lR

With all of that, you can then try to run procer to watch it fail but still try to run everything:

> sudo procer $PWD $PWD/../run/procer.pid 
> less error.log

This is assuming that you are still in the profiles directory. You should see the file error.log get created and probably some messages printed to the screen. Just ignore any mention of Mongrel2 since that's probably just cruft from the libm2.a we haven't removed.

Take a look in the error.log and you'll see it's not necessarily errors but information on how things were run. You should see something like this for each profile:


Source 17: First Dummy Run Of procer
DEBUG procer.c:232: Loading 5 actions. 
DEBUG procer.c:83: STARTED chat 
ERROR Failed to open PID file /home/zedshaw/deployment/profiles/chat/chat.pid for reading. 
ERROR Failed to open PID file /home/zedshaw/deployment/profiles/chat/chat.pid for reading. 
INFO  No previous Mongrel2 running, continuing on. 
DEBUG procer.c:37: ACTION: command=/home/zedshaw/deployment/profiles/chat/run, pid_file=/home/zedshaw/deployment/profiles/chat/chat.pid, restart=1, depends=(null) 
DEBUG procer.c:56: WAITING FOR CHILD. 
INFO  Now running as UID:1000, GID:1000 
DEBUG procer.c:60: Command ran and exited successfully, now looking for the PID file. 
ERROR chat didn't make pidfile /home/zedshaw/deployment/profiles/chat/chat.pid.

I've cleaned this up a bit and, again, ignore that it's saying “Mongrel2”; that's just cruft from the library since it was originally designed for Mongrel2. What you can see here is the following:

  1. It starts up and says it found 5 profiles.
  2. It starts chat, and says there's no PID file so it's good to continue.
  3. It reports what ACTION it's running, so you can see the config.
  4. It spawns off your run script, drops privilege and says it's WAITING for your script to exit.
  5. After your script runs, it looks for the PID file you gave in pid_file and, if it's not there, it exits that action.
  6. It does this for all of them and, since none of them run right, procer exits.
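
The "look for the PID file" step is the heart of this, so here's a rough Python sketch of what such a check can look like. This is not procer's code (procer is written in C, as the log lines show), and the function names are invented for the example. It assumes a Unix system, where sending signal 0 tests whether a process exists:

```python
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or invalid."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid_file):
    """True if pid_file names a process that currently exists (Unix only)."""
    pid = read_pid(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

With the dummy run scripts above, a check like this fails for every profile because nothing ever writes the PID file.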

Next up, let's get Mongrel2 running inside procer:


Source 18: procer Config For Mongrel2
> cd ~/deployment 
# make mongrel2 run as root 
deployment > sudo chown root:root profiles/mongrel2 
 
# tell procer where mongrel2 puts its pid_file 
# notice the > not >> on this 
deployment > echo "$PWD/run/mongrel2.pid" > profiles/mongrel2/pid_file 
 
# make the run script start mongrel2 (notice the >> on this) 
deployment > echo "cd $PWD" >> profiles/mongrel2/run 
deployment > echo "m2sh start -db config.sqlite -host localhost" >> profiles/mongrel2/run 
 
# check out the results 
deployment > cat profiles/mongrel2/run 
#!/bin/sh 
cd /home/YOU/deployment 
m2sh start -db config.sqlite -host localhost

Obviously, you don't have to use a series of echo commands to make these scripts. You can edit them just fine, we're just doing it this way so that you can follow along easier.

Now, make sure you don't have any other Mongrel2 processes running, and then start procer again to see if it starts this configuration correctly.


Source 19: Using procer To Run Mongrel2
> cd ~/deployment 
# clear out the error.log for testing 
deployment > rm profiles/error.log 
 
# start procer 
deployment > sudo procer $PWD/profiles $PWD/procer.pid 
 
# see if procer is running 
deployment > ps ax | grep procer 
17934 ?        Ss     0:00 procer /home/zedshaw/deployment/profiles /home/zedshaw/deployment/procer.pid 
 
# see if mongrel2 is running 
deployment > ps ax | grep mongrel2 
17944 ?        Ssl    0:00 mongrel2 config.sqlite ba0019c0-9140-4f82-80ca-0f4f2e81def7

To watch procer in action, try doing m2sh stop -db config.sqlite -host localhost -murder and then look at profiles/error.log and watch Mongrel2 come right back.

4.5.1 The Python Examples

We've got a good setup of procer going and it keeps Mongrel2 running, so let's setup a similar thing for each of our little Python demos that we'll need. In order to do this, though, we sort of have to “hack in” making them daemonize and create PID files with a little shell script help. Let's start with the chat demo and, assuming your mongrel2 source is in ~/projects/mongrel2, you will change profiles/chat/run to be like this:


Source 20: Run Script For Chat Demo
#!/bin/sh 
set -e 
 
DEPLOY=/home/YOU/deployment 
SOURCE=/home/YOU/projects/mongrel2 
 
cd $SOURCE/examples/chat 
nohup python -u chat.py 2>&1 > chat.log & 
echo $! > $DEPLOY/profiles/chat/chat.pid

This little script uses some funky features you might not be familiar with, but which are nice to learn, so let's take a look:

  1. The first trick is set -e, which tells the shell to bail if there are any errors in your script. This is a huge life saver in system scripts.
  2. Next, you point some variables at where the deployment and Mongrel2 source live, remembering to not type YOU but your username.
  3. After that, you run the chat.py using a program called nohup. This basically daemonizes your script by redirecting output and keeping the program from being killed when the terminal hangs up, and then you background it with &.
  4. The final thing we do is echo the magic variable $! (the PID of the last process started in the background) to the chat.pid file in the profile directory.

When you run this manually, you should see something like this:

deployment > ./profiles/chat/run 
nohup: redirecting stderr to stdout 
 
deployment > ps ax | grep chat 
19305 pts/1    Sl     0:00 python chat.py 
 
deployment > kill -TERM 19305
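
If shell isn't your thing, the same launch-and-record-the-PID dance can be sketched in Python. This is only an illustration under a few assumptions: spawn_with_pid_file is a name I made up, start_new_session stands in roughly for what nohup gives you, and both stdout and stderr go to the log (the shell redirect order above only sends stdout to the file):

```python
import subprocess
import sys


def spawn_with_pid_file(command, log_path, pid_path, cwd=None):
    """Start command in the background, log its output, and record its PID.

    Hypothetical helper for illustration; procer itself only needs the
    PID file to show up where pid_file says it will be.
    """
    log = open(log_path, "ab")
    proc = subprocess.Popen(
        command,
        cwd=cwd,
        stdin=subprocess.DEVNULL,
        stdout=log,
        stderr=subprocess.STDOUT,  # send stderr to the same log
        start_new_session=True,    # detach from the terminal's session
    )
    log.close()  # the child keeps its own copy of the file descriptor
    with open(pid_path, "w") as f:
        f.write("%d\n" % proc.pid)  # same job as: echo $! > chat.pid
    return proc


if __name__ == "__main__":
    # Demo with a short-lived child so nothing is left running.
    child = spawn_with_pid_file(
        [sys.executable, "-c", "print('hello from the child')"],
        "demo.log",
        "demo.pid",
    )
    child.wait()
```

The run file in the profile still has to be something procer can execute, so treat this as a way to understand the script rather than a replacement for it.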

After all that, you can then try out procer again to see if it properly runs the chat demo as well as mongrel2:


Source 21: Running procer With Chat Demo
# run procer to get stuff started 
deployment > sudo procer $PWD/profiles $PWD/run/procer.pid 
 
# see if it's all running 
deployment > ps ax | grep procer 
19607 ?        Ss     0:00 procer /home/zedshaw/deployment/profiles /home/zedshaw/deployment/run/procer.pid 
 
deployment > ps ax | grep mongrel2 
19621 ?        Ssl    0:00 mongrel2 config.sqlite ba0019c0-9140-4f82-80ca-0f4f2e81def7 
 
deployment > ps ax | grep chat 
19609 ?        Sl     0:00 python chat.py 
 
# try killing chat to see if it comes back 
deployment > kill -TERM `cat profiles/chat/chat.pid` 
 
deployment > ps ax | grep chat 
19669 ?        Sl     0:00 python chat.py

If you go look at profiles/error.log, you'll see that procer is also running each of them as the right user, with chat being run as you, but Mongrel2 being run as root so it can chroot/drop privileges properly.

Rather than walk you through each of these setups, here are the run scripts for the remaining profiles:


Source 22: Remaining Run Scripts
profiles/handlertest/run
#!/bin/sh 
set -e 
 
DEPLOY=/home/YOU/deployment 
SOURCE=/home/YOU/projects/mongrel2 
 
cd $SOURCE/examples/http_0mq 
nohup python -u http.py 2>&1 > http.log & 
echo $! > $DEPLOY/profiles/handlertest/handlertest.pid

profiles/mp3stream/run
#!/bin/sh 
set -e 
 
DEPLOY=/home/YOU/deployment 
SOURCE=/home/YOU/projects/mongrel2 
 
cd $SOURCE/examples/mp3stream 
nohup python -u handler.py 2>&1 > mp3stream.log & 
echo $! > $DEPLOY/profiles/mp3stream/mp3stream.pid

profiles/web/run
#!/bin/sh 
set -e 
 
DEPLOY=/home/YOU/deployment 
SOURCE=/home/YOU/projects/mongrel2 
 
cd $SOURCE/examples/chat 
nohup python -u www.py 2>&1 > www.log & 
echo $! > $DEPLOY/profiles/web/web.pid

4.5.2 Testing The New Setup

Once everything is running and procer is maintaining it, you just need to see if things work. Here are some curl commands to try:


Source 23: Testing With Curl
> curl http://localhost:6767/ 
Hello, World! 
> curl http://localhost:6767/handlertest 
...

4.5.3 Nice Features of Procer

There are some nice, subtle features you get from using procer to run your stuff:

Faster Development
A great thing about procer is once you get all of this setup, it cuts down on a lot of your setup time and development time because it will properly restart things for you. This means you can simply make changes to code or configs, and then just kill the process and procer will kick it back over automatically.
Easy Automation
You should start to see how you could automate creating profiles for new processes since the setup is consistent.
profiles/run.log
All your commands will have their output sent to this file so you can see how they might be blowing up in your scripts.
Restart State Maintained
Since procer is just tracking PID files and processes, if you shut it down, it won't kill the world. When you start it back up, it just starts new stuff or stuff it needs, then goes back to supervising. This means you can change the configs for procer then just kick it over and it'll do the right thing.
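
To make the “Easy Automation” point concrete, here's a minimal sketch that writes the same four files we created by hand in Source 16. The function name make_profile and its parameters are my own invention, not something m2sh or procer ships:

```python
import os
import stat


def make_profile(profiles_dir, name, pid_path=None, run_lines=()):
    """Create profiles_dir/name with the four files used in Source 16.

    Hypothetical helper for illustration only.
    """
    profile = os.path.join(profiles_dir, name)
    os.makedirs(profile, exist_ok=True)

    # restart and depends start out empty, like touch in the walkthrough.
    for empty in ("restart", "depends"):
        open(os.path.join(profile, empty), "a").close()

    # pid_file holds the path procer should look at, defaulting to NAME/NAME.pid.
    if pid_path is None:
        pid_path = os.path.join(os.path.abspath(profile), name + ".pid")
    with open(os.path.join(profile, "pid_file"), "w") as f:
        f.write(pid_path + "\n")

    # run is a shell script; with no run_lines it does nothing.
    run_path = os.path.join(profile, "run")
    with open(run_path, "w") as f:
        f.write("#!/bin/sh\n")
        for line in run_lines:
            f.write(line + "\n")
    mode = os.stat(run_path).st_mode
    os.chmod(run_path, mode | stat.S_IXUSR)  # same as chmod u+x
    return profile
```

Calling make_profile("profiles", "chat") gives you the do-nothing skeleton, and passing run_lines lets you fill in the script body the way the echo commands did for mongrel2.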

The key thing, though, is that you now have the whole application for the mongrel2.org demo up and running, including automated process management, configuration, and managing everything.

4.6 Step 4: Static Content

The final thing we have to do is get the static content we need to try out the chat demo:


Source 24: Setting Up Static Content
> cp -r ~/projects/mongrel2/examples/chat/static static/chatdemo 
> m2sh stop -db config.sqlite -host localhost -murder 
> curl -I http://localhost:6767/chatdemo/

If you get a good response then you should be able to go to http://localhost:6767/chatdemo/ and the chat should work. Notice also that you just killed mongrel2 with m2sh and it came back because of procer. If you do your curl check too fast, you might miss it, so just wait a bit.

4.7 Step 5: Testing And Troubleshooting

You should have been testing the configuration as you went, but the main things to test are:

  1. The /chatdemo/ works and you can send messages. Try a few different browsers.
  2. You can get a simple message from the /handlertest/ and that's about it.
  3. See if you can get the mp3streamer to stream some mp3s. Put a few in its directory, then kill it so procer brings it back. Then, point mplayer at http://localhost:6767/mp3stream and it should work.
  4. Check that you can make the proxy go to the web.py app you start in the chat demo's directory.
  5. See if you can stop things and have procer bring them back.
  6. Stop procer and then start it again to see if it properly doesn't step on things.
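
A few of these checks are easy to script. Here's a small smoke test sketch using only the Python standard library. The port and paths come from the curl commands earlier in this chapter, while the function names are made up. I've left /mp3stream out because a streaming endpoint isn't a good fit for a quick status check:

```python
import urllib.error
import urllib.request

BASE = "http://localhost:6767"
PATHS = ["/", "/handlertest", "/chatdemo/"]


def check(url, timeout=5):
    """Return the HTTP status for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with a success code
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, and so on


def smoke_test(base=BASE, paths=PATHS):
    """Map each path to its status code, or None if the request failed."""
    return {path: check(base + path) for path in paths}
```

Run smoke_test() after you kill something and you can watch a None turn back into a status code once procer has restarted it.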

If you run into problems, make sure that you can run each little piece and that the files you were supposed to make are correct. The best tool to use is diff.

4.8 Further Improvements

That ends this chapter, and at this point you should know how to setup nearly everything Mongrel2 has to offer right now. You should have a good idea of how procer will work or not for your real deployments, and how it's used by me for my own deployments.

A major improvement that we may eventually make is automating setup of procer profiles, and just better overall management of the profiles with m2sh. If you feel like hacking on that, just go ahead and try and let us know.

Other than that, automate, automate, automate.

4.9 Deployment Tips

Mongrel2 enforces the correct behavior when you run as root, which is to drop priv and chroot. This makes the server more secure, and it also simplifies your deployments. Since everything you do always runs in a chroot, you now just need to rsync that chroot directory, or put it into a git or hg repo, and you're set. You're literally forced to make your deployments portable to different directories and systems.

As of the 1.0 push for Mongrel2, we haven't done much work on how you deploy all the different languages. They sort of sprang up during development and our plan is to expand that out in the 2.0 version so that deployment is very well documented for all the different languages we support. That means you'll probably run into some snags and things we didn't anticipate.

The following are some general points we've come up with while deploying our own apps, with more to come as we work on the 2.0 version:

  1. Don't run things as root if you can avoid it. It's a bad habit that everyone tries to do their sysadmin work completely as root, so Mongrel2 is designed to run very easily under a regular user account. The only time you really should be running as root is when you do a quick -sudo to m2sh to start mongrel2 up so it can chroot.
  2. Use the chroot to keep your deployment simpler. I literally do all my work locally and then just rsync my changes up to my remote staging server. Everything has to live in the chroot anyway, and the chroot enforces that it is completely self-contained.
  3. Use Python's virtualenv or anything similar to get yourself a totally local environment. Too many systems, such as OSX, have very outdated packages and will change versions on you without telling you. The best way to make sure your software keeps working (and works as one cohesive deployment) is to use a virtualenv inside your chroot. It should even work cross-platform if you don't have compiled packages in there.
  4. Create a user for your application and live in there. I don't have any root access on my stuff. Everything is run as a user named after the website, and is deployed right in the /home/USER directory. I login as that user, manage as that user, and I don't give them sudo access. For the times when I need to sudo to restart or run Mongrel2, I then use a separate login that I have open (with screen) and do it there. This reduces your risk of hacks, but also just simplifies things. It's no problem for me to move my configuration over to new machines with this setup, or deploy clusters. I know that as long as there's the right user on the target, I'm set.
  5. Use GNU screen or die.
  6. Keep your config.sqlite and the .py file in your chroot, and keep your content and everything else under that. This makes sure that the config isn't accessible outside your content directories. Mongrel2 helps you get this right by not allowing certain Dir configurations that would expose your chroot to the world.

There are a few additional tips for people who want to use alternative process supervision like daemontools, runit, or init.d setups. No matter what you use, you should probably follow this advice:

  1. Whatever you use for process management, make sure it can run stuff as not root and can do chroot for you. If you're running your Mongrel2 as root, you're doing it wrong. Actually, if you're running any services as root that don't absolutely need to be, you're doing it wrong.
  2. Mongrel2 is happy to run as a regular user, and assumes that if you do not run as root, then you probably want to run under daemontools or similar. It won't chroot or drop priv and logs to stdout/stderr.
  3. If you need to bind to port 80 but run under daemontools as a regular user, then use privbind to do it. This tool will run any command, like mongrel2, but it does it in a way that lets the executable grab ports below 1024. This restriction on ports is actually really stupid so don't worry about doing this.
  4. Make sure your process monitor is not a single point of failure. Some of them out there will take your whole world down if they crash. Try doing a harsh kill on your process manager and see how it behaves. As much as they like to tell you not to worry about this because they “run forever”, everything has bugs and stupid people tend to kill things they don't get. If taking one process down nukes your whole server, then that's a bad design.

As we work on the next phase of Mongrel2 development, this will improve, so watch for news about deployment and real applications.

Chapter 5
Hacking

This chapter is all about making cool things with Mongrel2. It covers all the non-deployment features that you get from the browser's side and the handler/backend side of your application. I'll show you how the chat demo works for the async web sockets. I'll get into writing your own handlers using a few other demos. I'll cover some of the interesting things you can do with Mongrel2 you can't do with other servers. Finally, I'll get into practical things, when to do proxying and when to use a 0MQ handler.

For the majority of this chapter, I'll be using Python, but the demos should translate to the other languages that are implemented. I'll periodically show how another language does one of the demos, so you can get the idea that Mongrel2 is language agnostic. In no way should you take me using Python in this chapter to mean you can't use something else for your handlers.

Currently supported languages are:

Python
When you installed the m2sh gear, you also got a mongrel2 Python library.
Ruby
Probably the most extensively supported language, with good Rack support, by perplexes on github.
C++
C++ support by akrennmair on github.
PHP
PHP support by winks on github.
C
You can also write handlers in C using the Mongrel2 library, but it's really rough, and not recommended yet. A C library will come, though.
Others?
ZeroMQ supports Ada, Basic, C, C++, Common Lisp, Erlang, Go, Haskell, Java, Lua, .NET, Objective-C, ooc, Perl, PHP, Python, and Ruby, so after reading this chapter you can easily write handlers in any of those languages too.

However, no matter how many languages Mongrel2 supports, you will still have applications that can't fit into 0MQ handlers and just work better as classic web apps, either because you've already written them and have existing infrastructure, or because of some architectural issues that require it to run traditionally. Because of that, Mongrel2 supports HTTP proxying, which allows you to route requests to basic web server backends that don't support 0MQ.


Note 7: What About FastCGI/AJP/CGI/SCGI/WSGI/Rack?

Nothing prevents you from writing your own connector between Mongrel2 and your deployment protocol of choice. If you need to run FastCGI or AJP in your environment, then your best bet is to just make a handler that translates Mongrel2 requests to the protocol you need and back. The Mongrel2 format is very easy to parse and translate, so you should be able to do it with no problem. The Ruby library already supports Rack as an example, and Python will support WSGI soon.

However, Mongrel2 itself doesn't support any of these directly. Doing so would bring back the language specific infections that cause other web servers to go south. The design of most of these protocols tends to be either before the modern web, or specific to one particular language. Instead of trying to cater to all the possible languages out there, Mongrel2 just gives the tools to connect to it yourself.


5.1 Front-end Goodies

Mongrel2 supports your standard web server features like serving files, routing requests to another HTTP server, multiple host matching, good 304 support, and just generally being able to interact with a browser like normal. You've seen most of these features as you setup and deployed a Mongrel2 configuration, but let's go through some of them in more detail so you know what's possible.

5.1.1 HTTP

Mongrel2 uses the original Mongrel parser that powers quite a few other web servers and large, successful websites. This parser is rock solid, dead accurate, and by design blocks a lot of security attacks. For the most part you don't have to worry about this and just need to know Mongrel2 is using the same stable HTTP processing that has been working great for many years.

Another way to put this is if Mongrel2 says your request is invalid, it most definitely is.


Note 8: Idiots and RFC Implementers

I don't know why, but people who implement RFCs pick up very weird cargo cult beliefs peddled by the people who write the standards. In HTTP it was two things which the creators of HTTP have actually back-pedaled on: Accept everything, and keep-alives with pipe-lines.

The truth is, if you want a secure server of any kind, blindly accepting every single thing any idiot sends you is going to open your server up to a huge number of attacks. If you look at every attack on existing HTTP servers you'll find that about 80% of them are exploiting ambiguous parts of the HTTP grammar to pass through malicious content or overflow buffers. In Mongrel2 we use a parser that rejects invalid requests from first basic principles using technology that's 30 years old and backed by solid mathematics. Not only does Mongrel2 reject bad requests, but it can tell you why the request was bad, just like a compiler. This doesn't mean Mongrel2 is ruthless, but it definitely doesn't tolerate ambiguity or stupidity.

Mongrel2 completely supports keep-alives because now, since it's not using Ruby at all, it can scale up beyond 1024 file descriptors. Ruby was limited in the number of open files a process could have, so the original Mongrel had to break keep-alive and kill connections in order to save itself from greedy browsers that never close them. Mongrel2 doesn't have this limitation, so it uses full keep-alives and has a dead accurate state machine to manage them correctly.

Where problems come in is with pipe-lined requests, meaning a browser sends a bunch of requests in a big blast, then hangs out for all the responses. This was such a horrible stupid idea that pretty much everyone gets it wrong and doesn't support it fully, if at all. The reason is it's much too easy to blast a server with a ton of requests, wait a bit so they hit proxied backends, and then close the socket. The web server and the backends are now screwed having to handle these requests which will go nowhere.

Mongrel2 does not support pipe-lined requests. It sends one, and waits for the response, and if you want more, then tough. Screw you because it has no advantage for Mongrel2 and dubious advantages to you. It is simply one more attack vector for the server and is rejected outright.

These two things are rejected outright by Mongrel2 simply because they are stupid ideas and in 2010 nobody should be writing clients so badly that they need these features.


5.1.2 Proxying

You've already seen configurations that have the Proxy routes working, so it should be easy to understand what's going on. You just create routes to backends that are HTTP servers and Mongrel2 shuttles requests to them, then proxies responses back.

The Proxying support in Mongrel2 is accurate, but it's not very capable right now. For example, there's no round-robin backend selection, or page caching, or other things you might need for more serious deployments. Those features will come eventually, though.

What you do get with Mongrel2's proxying, though, is a dead accurate way of slicing up your application by routes. Other web servers make you go through great pain in order to have some URLs go to a proxy and others go to handlers or directories. They make you use odd “file syntax”, weird pseudo-turing logic if-statements, and other odd hacks to get flexible route selection. They also tend to not maintain keep-alives properly between proxy requests and other requests.

Mongrel2 uses the exact same routing syntax for all backends and has no distinction between them. It also properly does keep-alives for as long as it is efficient to do so.


Note 9: Proxying And 0MQ Handlers Are Like mod_*

A quick note for people coming from other web servers. If you use nginx then you are probably familiar with the concept of proxying to a “backend” like Ruby on Rails or Django. If you use PHP or another language, you may be used to a system like mod_php which manages your code for you and reloads when you make changes. If you use Apache, then you probably think in terms of “virtual hosts” and “mod_rewrite rules”.

In Mongrel2 all the same concepts are there, it's just cleaned up. If you want Mongrel2 to talk “nginx/mod_rewrite style” to another backend web server, then that's Proxying. If you want to have fast backend handlers then that's 0MQ Handlers.

We really don't have anything like mod_php because the whole idea of embedding a programming language runtime inside Mongrel2 would defeat the point of making it language agnostic.


5.1.3 WebSockets

Mongrel2 supports WebSockets, in that if someone connects with a WebSocket, then it will get sent to handlers that use 0MQ and they can interact with it like every other request. WebSockets are the least supported, though, since they aren't fully baked and nobody's bothered to write helper gear for handlers. They work, in theory, but haven't really been tried in practice.

If you try them out and find things that need fixing, let us know.


Note 10: Death To WebSockets

I am getting sick of these “Backroom RFCs” that are crafted around a bunch of huge companies' crap implementations found in their existing products rather than things real programmers need. WebSockets is yet another example of this, with so many odd features and annoying agendas that I hope it fails miserably.

First, it actually specifies Unicode for the wire protocol in the HTTP headers. This is such a monumental bad idea and will break so many web servers and browsers that I'm baffled how it got into the standard. The only thing I can think of is some “reverse colonialist” who thinks the world should be a happy rainbow of Unicode demanded it be in there. Unicode in the wire protocol adds no linguistic value (nobody reads headers), complicates servers, adds security concerns because Unicode is ambiguous, and also violates the existing HTTP standard which specifies ASCII for the wire protocol. As I've said before in this manual, writing protocols is hard enough without having to translate and deal with weird UTF-8 Unicode bizarreness.

Second, there is an idiotic “encryption” mechanism for a key exchange that can only be described as completely damaged and amateurish. The scheme involves taking a regular number and then mixing in non-number characters, so 1234 becomes 1@%^2*(34 and then is “decrypted” by just reading only the digits. Yes, this encryption is so advanced you can do it in your head visually. I'm sorry, but any obfuscation that can be done by an 11 year old on paper is not encryption and should not be used at all. Again, this smacks of some “clever” feature one of the many corporations out there invented and not something any real programmer needs.

There's more, but these two alone are enough to decide to hold off on WebSockets until they get their head straight and specify something that mere mortals can hope to implement.


5.1.4 JSSocket

The Mongrel2 chat demo uses JSSocket to do its magic, and it works great, but it requires Flash and, oh, man, do I absolutely hate Flash. However, it works, and works now, and works in every browser, even really old, busted ones. That means it's the first thing we implemented and the one we'll keep for a while until it proves itself not useful. The chat demo we'll cover will show you how to hook this up for fast async messaging and presence detection.

5.1.5 Long Poll

Mongrel2 just works as if everything is an HTTP long poll; normal request/responses are simply super fast long polls. For the most part you don't even need to know this exists; it's just how things are and they make perfect sense. You get requests from a certain server with a certain connected identity, and then you send stuff to that target. That's it. If you send it one response, or a stream of them, or setup a long poll configuration, then that's up to you.

5.1.6 Streaming

Because everything in Mongrel2 is asynchronous, and it allows you to target any connected listeners from your handlers, even with partial messages, you can easily do efficient streaming applications. ZeroMQ is an incredibly efficient transport mechanism, and with it you can send tons of information to many browsers or clients at once. This means streaming video and MP3 streams to listeners is very trivial. We'll cover the mp3stream example where you get to see a simple implementation of the ICY MP3 streaming protocol.

5.1.7 N:M Responses

What makes streaming, async messaging, and long poll designs so efficient in Mongrel2 is that you can send one message and target up to 128 clients with that one message. This means sending large scale replies to many browsers requires less copying of the message and fewer transports.
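
If you have more than 128 listeners, your handler has to split them into groups and send one message per group. Here's a tiny sketch of just that splitting step. The batches helper is hypothetical, and how the IDs are actually written into a message is covered in the handler format section later:

```python
MAX_TARGETS = 128  # per-message limit stated above


def batches(conn_ids, size=MAX_TARGETS):
    """Yield lists of at most size connection IDs, preserving order."""
    ids = list(conn_ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    # 300 listeners need three messages: two full groups and one partial.
    for group in batches(range(300)):
        print(len(group))
```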

In addition to this, you can setup Mongrel2 with the help of some 0MQ to send one request from a browser to as many target handlers as you like. You can even send them messages using OpenPGM for sending UDP messages reliably to clusters of computers.

This means that Mongrel2 is the only web server capable of sending one request from a browser to N backends at once, and then return the replies from these handlers to M browsers. Not exactly sure what you could write with that, but it's probably something really damn cool.

5.1.8 Async Uploads

Mongrel2 also solves the problem of large uploads choking your server because you can't stop them before they're complete. Mongrel2 will stream large requests to temporary files, but it sends your handlers an initial “upload started” message. When the upload is done, you get a final “upload finished” message. If, at any time, you want to kill the upload, you just send a 0-length reply (the official KILL MESSAGE) and the whole thing is aborted and cleaned up.

5.2 Introduction to ZeroMQ

Learning to use ZeroMQ is a bit outside the scope of this book, but I'll give you a quick introduction to it so that you can explore it further. It is a very simple system, so hopefully you can figure it out from here.

What ZeroMQ ends up being is “sockets the way programmers think sockets work”. Programmers usually hear about TCP or UDP sockets and they think in a naïve way that they work like this:

TCP
Programmers think these are streamed sequential messages, so if they send a “message” that's 10k long, then when the receiver reads from the TCP socket they get a single 10k message. This only works with small messages and not on the Internet, so it's easy to see how it would seem to work that way. The reality is that you can get chunks of any size, and without some framing mechanism you wouldn't know where one message ends or the next begins. TCP is a streaming protocol.
UDP
Programmers think UDP sockets are single, fast, reliable messages that can be sent to one or more target clients. They at least know UDP has a fixed upper bound size, but they don't get that UDP is very unreliable and that the addressing is fairly weak.

What ZeroMQ does is create an API that looks a lot like sockets, and feels the same, but gives you the messaging styles you actually want. By simply specifying the type of socket when you call zmq_socket you can have multicast, request/reply, and many other styles. Here's the list of ones currently implemented (as documented in man zmq_socket):

ZMQ_REQ/ZMQ_REP
Your classic, strict, REQuest/REPly sockets, that work much like HTTP request/response semantics. These are more strict and lock-step, but are slower because of the extra overhead in keeping the state organized.
ZMQ_PUB/ZMQ_SUB
These are PUBlish/SUBscribe sockets and work as decentralized asynchronous messaging. PUB sockets just send out messages to multiple subscribers, without expecting a reply. SUB sockets just read messages that are sent to them, and can subscribe to prefixes on the messages.
ZMQ_PUSH/ZMQ_PULL
PUSH/PULL sockets are more like asynchronous round-robin sockets. They work like PUB/SUB, but instead of the message going to every subscriber, a PUSH/PULL socket will have a message go to just one of the receivers in a cluster. This is a lot like having a round-robin proxy configuration in your web server.
ZMQ_PAIR
A PAIR is just a direct connection between two peers, basically like a better TCP connection.
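
To make the difference between fan-out and round-robin delivery concrete, here is a toy model in plain Python 3. It does not use ZeroMQ at all; the function names and the subscriber list are made up for illustration, and it only mimics the delivery rules described above.

```python
from itertools import cycle

def publish(message, subscribers):
    """PUB/SUB model: every subscriber whose prefix matches gets a copy."""
    return [name for name, prefix in subscribers if message.startswith(prefix)]

def make_pusher(workers):
    """PUSH/PULL model: each message goes to exactly one worker, in turn."""
    turn = cycle(workers)
    return lambda message: next(turn)

# Hypothetical subscribers as (name, subscribed prefix) pairs.
subscribers = [("all", ""), ("alerts", "ALERT"), ("stats", "STAT")]
print(publish("ALERT disk full", subscribers))

push = make_pusher(["worker-a", "worker-b"])
print([push(job) for job in ("job1", "job2", "job3")])
```

The first print shows the message reaching both the catch-all subscriber and the one subscribed to the ALERT prefix. The second shows three jobs alternating between the two workers.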

The next thing ZeroMQ does is it separates these types of messaging from the underlying transport protocol and gives you a simple URL-like syntax for specifying that: tcp://127.0.0.1:9999/ for example. The types of transports that you can use when you call zmq_bind and zmq_connect are:

tcp://
This is a plain old TCP socket with a host and port number.
ipc://
This uses Unix inter-process communication like domain sockets, mq, or whatever is available.
pgm://
Reliable multicast messaging that uses raw IP layering and requires special privileges.
epgm://
The “encapsulated” version that uses regular UDP to do reliable multicast messaging.

Finally, ZeroMQ divorces who binds and connects from the above two configurations. This means that, unlike classic sockets, who connects and who binds doesn't matter for the direction or kind of message. All that really matters is whether the connection makes sense for your application. For example, in Mongrel2 the server does a bind on the handler ports so that handlers can connect at will, which gives your handlers a nice “zero config” setup. Mongrel2 doesn't need to know about handlers; they just need to know about Mongrel2.

How this last part works, and why it works, is that when clients or servers connect or disconnect, ZeroMQ doesn't blow up. In TCP, if I have a server, and then 10 clients connect, when one of them disconnects in the middle of a message I lose the message and get an error. With ZeroMQ, clients connecting to a socket don't trigger an event, and disconnecting doesn't do anything other than hold or drop messages. This is why you don't have to care about who binds or connects, because it's just an addressing mechanism, not a state management mechanism.

Another way to put this feature is there's no accept in ZeroMQ. Clients connect and go at will and the server is just reading messages off when they're available or sending them out for clients.

5.2.1 A Quick Python ZeroMQ Example

I've written a simple abstraction over ZeroMQ that fits the Mongrel2 usage of it, but I think learning how you'd write your own ZeroMQ simple echo server in Python will help you get a handle on it. First the client then the server:


Source 25: Simple Python ZeroMQ Client
import zmq 
 
ctx = zmq.Context() 
s = ctx.socket(zmq.SUB) 
s.connect("tcp://127.0.0.1:5566") 
s.setsockopt(zmq.SUBSCRIBE, '') 
 
msg = s.recv() 
print "MSG: ", repr(msg)


Source 26: Simple Python ZeroMQ Server
import zmq 
import time 
 
ctx = zmq.Context() 
s = ctx.socket(zmq.PUB) 
s.bind("tcp://127.0.0.1:5566") 
 
while True: 
    s.send("HELLO") 
    time.sleep(1)

You can then run these two in different windows and they will talk to each other. Try playing around with different socket types and transports to see what they do. Notice that for a PUB/SUB setup we have to use setsockopt to subscribe to the empty prefix (''), which matches every message. This is the same no matter what language you use.

Here's an example that does the same thing but with REQ/REP style of messages.


Source 27: ZeroMQ REQ/REP Client
import zmq 
 
ctx = zmq.Context() 
s = ctx.socket(zmq.REQ) 
s.connect("tcp://127.0.0.1:5566") 
 
s.send('HI FROM CLIENT') 
 
msg = s.recv() 
print "MSG: ", repr(msg)


Source 28: ZeroMQ REQ/REP Server
import zmq 
 
ctx = zmq.Context() 
s = ctx.socket(zmq.REP) 
s.bind("tcp://127.0.0.1:5566") 
 
while True: 
    print "GOT BACK", repr(s.recv()) 
    s.send("HELLO")

As you can see when you run this, it's more like your classic web server style of messaging, where a client requests something with an initial message, and the server replies. Try getting the order wrong and see how ZeroMQ aborts and tells you it's wrong. REQ sockets must send first then recv, and REP sockets must recv then send.


Note 11: There's Always a Size in ZeroMQ Land

The lack of a reliable framing mechanism in TCP was a crime against humanity. What I mean by a “frame” is a simple indicator that a message in a stream has a certain length. If you preface each message in TCP with the length of the data you're about to send then you avoid all manner of annoyance, attacks, and bugs. Something as simple as a single byte that says you're sending up to 128 more bytes, with an extra bit to indicate the last byte would have saved the world much pain.

This is basically what ZeroMQ has done, and so much more. ZeroMQ pulls out all the tricks to make sure that the message you receive is totally cooked, fully sized, and transports it faster than TCP can actually send it. It does this by framing things intelligently, using compression, reducing copying, and just generally being awesome.

Of course, the only limitation is that it can't really stream things. But then again, nobody really does true streaming. They always end up having to bolt on some framing of some kind.
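
If you want to see what bolting framing onto a byte stream looks like, here is a minimal Python 3 sketch. It uses a fixed 4-byte length prefix, which is my own choice for illustration and is not ZeroMQ's actual wire format; frame and unframe are hypothetical helper names.

```python
import struct

def frame(payload):
    """Prefix payload (bytes) with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload

def unframe(buffer):
    """Split a received byte buffer into complete messages plus leftover bytes."""
    messages = []
    while len(buffer) >= 4:
        (size,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + size:
            break  # the rest of this message has not arrived yet
        messages.append(buffer[4:4 + size])
        buffer = buffer[4 + size:]
    return messages, buffer

stream = frame(b"HELLO") + frame(b"WORLD")
# Pretend the network delivered the stream in two arbitrary chunks.
first, rest = unframe(stream[:7])
second, rest = unframe(rest + stream[7:])
print(first, second, rest)
```

The first chunk stops in the middle of HELLO, so nothing is delivered until the remaining bytes show up. That is the bookkeeping every TCP application ends up writing by hand.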


5.3 Handler ZeroMQ Format

You've had the world's fastest crash course in ZeroMQ and now you're ready to see how Mongrel2 talks to your handlers with it. I won't really call this a “protocol”, since ZeroMQ is really doing the protocol, and we just pull fully baked messages out of it. Instead, this is just a format, as if you got strings out of a file or something similar. This message format is designed to accomplish a few things in the simplest way possible:

  1. Be usable from languages that are statically compiled or scripting languages.
  2. Be safe from buffer overflows if done right, or easy to do right.
  3. Be easy to understand and require very little code.
  4. Be language agnostic and use a data format everyone can accept without complaining that it should be done with their favorite.1
  5. Be easy to parse and generate inside Mongrel2 without having to parse the entire message to do routing or analysis.
  6. Be useful within ZeroMQ so that you can do subscriptions and routing.

To satisfy these features we use two types of ZeroMQ sockets (soon to be configurable), a request format that Mongrel2 sends, and a response format that the handlers send back. Most importantly, there is nothing about the request and response that must be connected. In most cases they will be connected, but you can receive a request from one browser and send a response to a totally different one.

5.3.1 Socket Types Used

First, the types of ZeroMQ sockets used are a ZMQ_PUSH socket for messages from Mongrel2 to Handlers, which means your Handler's receive socket should be a ZMQ_PULL. Mongrel2 then uses a ZMQ_SUB socket for receiving responses, which means your Handlers should send on a ZMQ_PUB socket. This setup allows multiple handlers to connect to a Mongrel2 server, but only one Handler will get a message in a round-robin style. The PUB/SUB reply sockets, though, will let Handlers send back replies to a cluster of Mongrel2 servers, but only the one with the right subscription will process the request.2

In the various APIs we've implemented, you don't need to care about this. They provide an abstraction on top of this, but it does help to know it so that you understand why the message format is the way it is.

This leads to rule number 1:

Rule 1: Handlers receive with PULL sockets and send with PUB sockets.

5.3.2 UUID Addressing

Do you remember all those UUIDs all over the place in the configuration files? They may have seemed odd, but they identify specific server deployments and processes in a cluster. This will let you identify exactly which member of a cluster sent a message, so that you can return the right reply. This is the first part of our protocol format and it results in rule 2:

Rule 2: Every message to and from Mongrel2 has that Mongrel2 instance's UUID as the very first thing.

5.3.3 Numbers Identify Listeners

You then need a way to identify a particular listener (browser, client, etc.) that your message should target, and Mongrel2 needs to tell you who is sending your handler the request. This means Mongrel2 sends you just one identifier, but you can send Mongrel2 a list of them. This leads to rule 3:

Rule 3: Mongrel2 sends requests with one number right after the server's UUID separated by a space. Handlers return a netstring with a list of numbers separated by spaces. The numbers indicate the connected browser the message is to/from.

In case you don't know what a netstring is, it is a very simple way to encode a block of data such that any language can read the block and know how big it is. A netstring is, simply, SIZE:DATA,. So, to send “HI”, you would do 2:HI,, and it is incredibly easy to parse in every language, even C. It is also a fast format and you can read it even if you're a human.
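
As a quick illustration, here is a small Python 3 sketch of netstring encoding and decoding on byte strings. The function names are mine, not part of any Mongrel2 library, and it assumes the whole netstring is already in memory.

```python
def encode_netstring(data):
    """Wrap a byte string as SIZE:DATA, where SIZE is the byte count of DATA."""
    return str(len(data)).encode("ascii") + b":" + data + b","

def decode_netstring(buf):
    """Return (data, remainder) for the first netstring in buf."""
    size, sep, rest = buf.partition(b":")
    if not sep or not size.isdigit():
        raise ValueError("netstring has no valid SIZE: prefix")
    length = int(size)
    if rest[length:length + 1] != b",":
        raise ValueError("netstring did not end in ','")
    return rest[:length], rest[length + 1:]

print(encode_netstring(b"HI"))
print(decode_netstring(b"2:HI,5:THERE,"))
```

Because the decoder hands back the remainder, you can call it again on what is left to pull out the next netstring. That is how a request carrying two of them back to back can be read.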

5.3.4 Paths Identify Targets

In order to make it possible to route or analyze a request in your handlers without having to parse a full request, every request has the path that was matched in the server as the next piece. That gives us:

Rule 4: Requests have the path as a single string followed by a space and no paths may have spaces in them.
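
Here is a rough Python 3 sketch of why this matters: a handler can pick a route from the first three space-separated fields and never touch the headers or body. The routing table, the sample message, and the SERVER-UUID placeholder are all invented for the example.

```python
def peek(msg):
    """Split off the UUID, connection ID, and path; leave the rest unparsed."""
    sender, conn_id, path, rest = msg.split(" ", 3)
    return sender, conn_id, path, rest

def route(msg, table):
    """Pick a handler name by longest matching path prefix (hypothetical table)."""
    _, _, path, _ = peek(msg)
    matches = [prefix for prefix in table if path.startswith(prefix)]
    return table[max(matches, key=len)] if matches else None

table = {"/": "default", "/chat": "chat", "/upload": "upload"}
# A made-up request: UUID, connection ID, path, then two netstrings.
msg = 'SERVER-UUID 7 /chat/room1 16:{"METHOD":"GET"},0:,'
print(route(msg, table))
```

The split is limited to three cuts, so spaces inside the headers or body never confuse it. This is also why paths themselves may not contain spaces.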

5.3.5 Request Headers And Body

We only have two more rules to complete the message format.

Rule 5: Mongrel2 sends requests with a netstring that contains a JSON hash (dict) of the request headers, and then another netstring with the body of the request.

Then there's a similar rule for responses:

Rule 6: Handlers return just the body after a space character. It can be any data that Mongrel2 is supposed to send to the listeners.

That could be HTTP headers, image data, HTML pages, streaming video, or anything else. You can also send as many responses as you like to complete the request, and any handler can send them.

5.3.6 Complete Message Examples

Now, even though we laid out all of this as a series of rules, the actual code to implement these is very simple. First here's a simple “grammar” for how a request that gets sent to your handlers is formatted:

UUID ID PATH SIZE:HEADERS,SIZE:BODY,

That's obviously a much simpler way to specify the request than all those rules, but it also doesn't tell you why. The above description, while boring as hell, tells you why each of these pieces exist.

To parse this in Python we simply do this:


Source 29: Parsing Mongrel2 Requests In Python
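import json

def parse_netstring(ns):
    # A netstring is SIZE:DATA, so split the size off first.
    length, rest = ns.split(':', 1)
    length = int(length)
    assert rest[length] == ',', "Netstring did not end in ','"
    # Return the data and whatever follows the trailing comma.
    return rest[:length], rest[length+1:]

def parse(msg):
    # UUID, connection ID, and path are space separated; the rest is netstrings.
    sender, conn_id, path, rest = msg.split(' ', 3)
    headers, rest = parse_netstring(rest)
    body, _ = parse_netstring(rest)

    headers = json.loads(headers)

    return sender, conn_id, path, headers, body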

This is actually all of the code needed to parse a request, and it is much the same in many other languages. If you look at the file examples/python/mongrel2/request.py, you'll see a more complete example of making a full request object.

A response is then just as simple and involves crafting a similar setup like this:

UUID SIZE:ID ID ID, BODY

Notice I've got three IDs here, but you can do anywhere from 1 up to 128. Generating this is very easy in Python:


Source 30: Generating Responses
 
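# Both functions are methods of the handler's connection class;
# self.resp is its PUB socket.
def send(self, uuid, conn_id, msg):
    header = "%s %d:%s," % (uuid, len(str(conn_id)), str(conn_id))
    self.resp.send(header + ' ' + msg)


def deliver(self, uuid, idents, data):
    self.send(uuid, ' '.join(idents), data)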

That, again, is all there is to it. The send method does the real work of crafting the response, and the deliver method just calls send with all the target idents joined by spaces.
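
If you want to play with the response format without a socket, here is a standalone Python 3 sketch that only builds the string. build_response and MAX_TARGETS are names I made up, and SERVER-UUID stands in for a real server UUID.

```python
MAX_TARGETS = 128  # upper bound on connection IDs per response, as stated above

def build_response(uuid, idents, body):
    """Format UUID SIZE:ID ID ID, BODY for one or more connection IDs."""
    idents = [str(i) for i in idents]
    if not 1 <= len(idents) <= MAX_TARGETS:
        raise ValueError("need between 1 and %d connection IDs" % MAX_TARGETS)
    id_list = " ".join(idents)
    # The netstring size covers the ID list only, spaces included.
    return "%s %d:%s, %s" % (uuid, len(id_list), id_list, body)

print(build_response("SERVER-UUID", [4, 9, 23], "HELLO"))
print(repr(build_response("SERVER-UUID", [4], "")))
```

The second call produces a response with an empty body, which is the 0-length reply the manual describes as the kill message.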

5.3.7 Python Handler API

Instead of building all of this yourself, I've created a Python library that wraps all this up and makes it easy to use. Each of the other libraries are designed around the same idea and should have a similar design. To check out how to use the Python API, we'll take a look at each of the demos that are available. These are the same demos you ran in the previous section to create a sample deployment.

For the Python API, you may want to start by looking at two very small files that you should be able to understand quickly: examples/python/mongrel2/request.py and examples/python/mongrel2/handler.py.

5.4 Basic Handler Demo

The most basic handler you can write is in the examples/http_0mq/http.py file, and it is just the simplest thing possible:3


Source 31: http.py example
 
from mongrel2 import handler 
import json 
 
sender_id = "82209006-86FF-4982-B5EA-D1E29E55D481" 
 
conn = handler.Connection(sender_id, "tcp://127.0.0.1:9997", 
                          "tcp://127.0.0.1:9996") 
while True: 
    print "WAITING FOR REQUEST"
 
    req = conn.recv() 
 
    if req.is_disconnect(): 
        print "DISCONNECT" 
        continue 
 
    if req.headers.get("KILLME", None): 
        print "They want to be killed."
        response = "" 
    else: 
        response = "<pre>\nSENDER:%r\nIDENT:%r\nPATH:%r\nHEADERS:%r\nBODY:%r</pre>" % ( 
            req.sender, req.conn_id, req.path, 
            json.dumps(req.headers), req.body) 
 
        print response 
 
    conn.reply_http(req, response)

All this code does is print back a simple little dump of what it received, and it's not even a valid HTML document. Let's walk through everything that's going on:

  1. Import the handler module from mongrel2 and json. The json module is really only used for logging.
  2. Establish the UUID for our handler, and create a connection. It's not really a connection but more of a “virtual circuit” that you can just pretend is a connection. It uses ZeroMQ and the format we just described to give you a simple API.
  3. Go into a while loop forever and recv request objects off the connection.
  4. One type of special message we can get from Mongrel2 is a “disconnect” message, which tells you that one of the listeners you tried to talk to was closed. You should either ignore those and read another, or update any internal state you may have. They can come asynchronously, and for the most part you can ignore them unless you need to keep them open as in, say, a chat application or streaming.
  5. Craft the reply you're going to send back, which is just a dump of what you received.
  6. Send this reply back to Mongrel2. Notice the subtle difference where you include the req object as part of how you reply? This is the major difference between this API and more traditional request/response APIs in that you need the request you are responding to so that it knows where to send things. In a normal socket-based server this is just assumed to be the socket you're talking about.

This is all you need at first to do simple HTTP handlers. In reality, the reply_http method is just syntactic sugar on crafting a decent HTTP response. Here's the actual method that is crafting these replies:


Source 32: HTTP Response Python Code
def http_response(body, code, status, headers): 
    payload = {'code': code, 'status': status, 'body': body} 
    headers['Content-Length'] = len(body) 
    payload['headers'] = "\r\n".join('%s: %s' % (k,v) for k,v in 
                                     headers.items()) 
 
    return HTTP_FORMAT % payload

This function is then used by Connection.reply_http and Connection.deliver_http to send an actual HTTP response. That means all this is doing is creating the raw bytes you want to go to the real browser, and how it's delivered is irrelevant. For example, the deliver_http method means that, yes, you can have one handler send a single response to target multiple browsers at once.
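
The listing refers to HTTP_FORMAT without showing it. Here is a self-contained Python 3 sketch with a template I am assuming for illustration; check examples/python/mongrel2/handler.py for the real one. The default arguments are also my own addition.

```python
# Assumed template; the manual's listing uses HTTP_FORMAT without defining it.
HTTP_FORMAT = "HTTP/1.1 %(code)s %(status)s\r\n%(headers)s\r\n\r\n%(body)s"

def http_response(body, code=200, status="OK", headers=None):
    """Render a complete HTTP response as one string."""
    headers = dict(headers or {})  # copy so the caller's dict is untouched
    # len() counts characters; encode non-ASCII text to bytes before measuring.
    headers["Content-Length"] = len(body)
    header_lines = "\r\n".join("%s: %s" % (k, v) for k, v in headers.items())
    return HTTP_FORMAT % {"code": code, "status": status,
                          "headers": header_lines, "body": body}

print(repr(http_response("hi", headers={"Content-Type": "text/plain"})))
```

Whatever the exact template is, the point stands: the result is just a string of bytes, and it becomes the BODY part of the handler response format.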

5.5 Async File Upload Demo

Mongrel2 uses an asynchronous method of doing uploads that helps you avoid receiving files you either can't accept or shouldn't accept. It does this by sending your handler an initial message with just the headers, streaming the file to disk, and then a final message so you can read the resulting file. If you don't want the upload, then you can send a kill message (a 0 length message) and the connection closes, and the file never lands.

The upload mechanism works entirely on content length, and whether the file is larger than the limits.content_length. This means if you don't want to deal with this for most form uploads, then just set limits.content_length high enough and you won't have to.

However, if you want to handle file uploads or large requests, then you set upload.temp_store to a mkstemp-compatible path like /tmp/mongrel2.upload.XXXXXX, with the XXXXXX chars being replaced with random characters. It doesn't have to be /tmp either, and can be any store you want: network disk, anything.

Here's an example handler in examples/http_0mq/upload.py that shows you how to do it:


Source 33: Async Upload Example
 
from mongrel2 import handler 
import json 
import hashlib 
 
sender_id = "82209006-86FF-4982-B5EA-D1E29E55D481" 
 
conn = handler.Connection(sender_id, "tcp://127.0.0.1:9997", 
                          "tcp://127.0.0.1:9996") 
while True: 
    print "WAITING FOR REQUEST"
 
    req = conn.recv() 
 
    if req.is_disconnect(): 
        print "DISCONNECT" 
        continue 
 
    elif req.headers.get('X-Mongrel2-Upload-Done', None): 
        expected = req.headers.get('X-Mongrel2-Upload-Start', "BAD") 
        upload = req.headers.get('X-Mongrel2-Upload-Done', None) 
 
        if expected != upload: 
            print "GOT THE WRONG TARGET FILE:", expected, upload
            continue 
 
        body = open(upload, 'r').read() 
        print "UPLOAD DONE: BODY IS %d long, content length is %s" % (
            len(body), req.headers['Content-Length'])

        response = "UPLOAD GOOD: %s" % hashlib.md5(body).hexdigest()
 
    elif req.headers.get('X-Mongrel2-Upload-Start', None): 
        print "UPLOAD starting, don't reply yet."
        print "Will read file from %s." % req.headers.get('X-Mongrel2-Upload-Start', None)
        continue 
 
    else: 
        response = "<pre>\nSENDER:%r\nIDENT:%r\nPATH:%r\nHEADERS:%r\nBODY:%r</pre>" % ( 
            req.sender, req.conn_id, req.path, 
            json.dumps(req.headers), req.body) 
 
        print response 
 
    conn.reply_http(req, response)

You can test this with something like curl -T tests/config.sqlite http://localhost:6767/handlertest to upload a big file.

What's happening is the following process:

  1. Mongrel2 receives a request from a browser (or curl in this case) that is greater than limits.content_length in size. It actually doesn't read all of it yet, only about 2k.
  2. Mongrel2 looks up the upload.temp_store setting and makes a temp file there to write the contents. If you don't have this setting then it aborts and returns an error to the browser.
  3. Mongrel2 sees that the request is for a Handler, so it crafts an initial request message. This request message has all the original headers, plus a X-Mongrel2-Upload-Start header with the path of the expected tmpfile you will read later.
  4. Your handler receives this message, which has no actual content, but the original content length, all the headers, and this new header to indicate an upload is starting.
  5. At this point, your handler can decide to kill the connection by simply responding with a kill message, or even with a valid HTTP error response then a kill message.
  6. Otherwise your handler does nothing, and Mongrel2 is already streaming the file into the designated tmpfile for this upload.
  7. When the upload is finally saved to the file, it adds a new header of X-Mongrel2-Upload-Done set to the same file as the first header. Remember that both headers are in this final request.
  8. Your handler then gets this final request message that has both the X-Mongrel2-Upload-Start and X-Mongrel2-Upload-Done headers, which you can then use to read the upload contents. You should also make sure the headers match to prevent someone forging completed uploads.
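
The header checks in the steps above boil down to a small decision function. Here is a Python 3 sketch of it, assuming headers is a plain dict like the one the handler API gives you; classify_upload and its return values are names I invented.

```python
START = "X-Mongrel2-Upload-Start"
DONE = "X-Mongrel2-Upload-Done"

def classify_upload(headers):
    """Return 'start', 'done', 'mismatch', or 'normal' for a request's headers."""
    started = headers.get(START)
    finished = headers.get(DONE)
    if finished is not None:
        # The final message must carry both headers naming the same file.
        return "done" if started == finished else "mismatch"
    if started is not None:
        return "start"
    return "normal"

print(classify_upload({START: "/tmp/up.abc123"}))
print(classify_upload({START: "/tmp/up.abc123", DONE: "/tmp/up.abc123"}))
print(classify_upload({DONE: "/etc/passwd"}))
```

The last call is the forgery case: a request claiming a finished upload without a matching start header gets flagged instead of being read from disk.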


Note 12: Watch The chroot Too

Remember, when you run Mongrel2 it will store the file relative to its chroot setting. In testing you probably aren't running Mongrel2 as root so it works fine. You just then have to make sure that your handler knows to look for the file in the same place. So if you have /var/www/mongrel2.org for your chroot and /uploads/file.XXXXXX for your upload.temp_store, then the actual file will be in /var/www/mongrel2.org/uploads/file.XXXXXX. The good thing is you can read the config database in your handlers and find out all this information as well.
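
In a handler that runs outside the chroot, the translation is one line. Here is a Python 3 sketch; host_path is a hypothetical helper, and it assumes you already know the chroot directory (for example from the config database).

```python
import posixpath

def host_path(chroot, upload_path):
    """Map a path as seen inside the chroot to the path outside it."""
    # Strip the leading slash so join() does not discard the chroot part.
    return posixpath.join(chroot, upload_path.lstrip("/"))

print(host_path("/var/www/mongrel2.org", "/uploads/file.XXXXXX"))
```

You would apply this to the value of the X-Mongrel2-Upload-Start header before opening the file.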


5.6 MP3 Streaming Demo

The next example is a very simple and, well, kind of poorly implemented MP3 streaming demo that uses the ICY protocol. ICY is a really lame protocol that was obviously designed before HTTP was totally baked and probably by people who don't really get HTTP. It works in an odd way of having meta-data sent at specific sized intervals so the client can display an update to the meta-data.

The mp3streamer demo creates a streaming system by having a thread that receives requests for connections, and then another thread that sends the current data to all currently connected clients. Rather than go through all the code, you can take a look at the main file and see how simple it is once you get the streaming thread right:


Source 34: Base mp3stream Code
 
from mp3stream import ConnectState, Streamer 
from mongrel2 import handler 
import glob 
 
 
sender_id = "9703b4dd-227a-45c4-