Mongrel2 is designed to be easy to deploy and to automate deploying. This is why it uses SQLite to store the configuration and m2sh as an interface for creating it. Doing this lets you access the configuration from any language that works for you, then augment it, alter it, migrate it, and automate it.
In this chapter, I’m going to show you how to make a basic configuration using m2sh and all the commands that are available. You’ll learn how the configuration system is structured so that you know what goes where, but in the end it’s just a simple storage mechanism.
When I first started talking about Mongrel2, I said I’d store the configuration in SQLite and do a Model-View-Controller kind of design. Immediately, people who can’t read flipped out and thought this meant they’d be back in “Windows registry hell”, but with SQL as their only way to access it. They thought that they’d be stuck writing configurations with SQL; that SQL couldn’t possibly configure a web server.
They were wrong on many levels. Nobody was ever going to make anyone use SQL. That was repeated over and over but, again, people don’t read and love spreading FUD. The SQLite config database is nothing like the Windows Registry. No other web server really uses a true hierarchy; they just cram a relational model into a weirdo configuration format. The real goal was to make a web server that was easy to manage from any language, and then give people a nice tool to get their job done without having to ever touch SQL. EVER!
In the end, what we got despite all this fear mongering is a bad ass configuration tool and a design that is simple, elegant, and works fantastically. If you read that Mongrel2 uses SQLite and thought this was weird, well, welcome to the future. Sometimes it’s weird out here (even though Postfix has been doing this for a decade or more).
When you hear Model-View-Controller, you think about web applications. This is a design pattern where you place different concerns into different parts of your system and try not to mix them too much. For an interactive application, if you keep the part that stores data (Model) separated from the logic (Controller) and use another piece to display and interact with the user (View), then it’s easier to change the system and adapt it over time to new features.
The power of MVC is simply that these things really are separate orthogonal pieces that get ugly if they’re mixed together. There’s no math or theory that says why; just lots of experience has told us it’s usually a bad idea. When you start mixing them, you find out that it’s hard to change for new requirements later, because you’ve sprinkled logic all over your web pages. Or you can’t update your database because there’s all these stored procedures that assume the tables are a certain way.
Mongrel2 needed a way to allow you to use various languages and tools to automate its configuration. Letting you automate your deployments is the entire point of the server. The idea was that if we gave you the Controller and the Model, then you can craft any View you wanted, and there’s no better Model than a SQL database like SQLite: it’s embeddable, easily accessed from C or any language, portable, small, fast enough and full of all the features you need and then some.
What you are doing when you use m2sh (from tools/m2sh) to build a configuration for Mongrel2 is working with a View we’ve given you to create a Model for the Mongrel2 server to work with. That’s it, and you can create your own View if you want. It could be automated deployment scripts, a web interface, monitoring scripts, anything you need.
The point is, if you just want to get Mongrel2 up and running, then use m2sh. If you want to do more advanced stuff, then get into the configuration database schema and see what you can do. The structure of the database very closely matches Mongrel2’s internal structure, so understanding that means you understand how Mongrel2 works. This is a vast improvement over other web servers like Apache where you’ve got no idea why one stanza has to go in a particular place, or why information has to be duplicated.
With Mongrel2, it’s all right there.
To give this configuration system a try, you just need to run the test configuration used in the unit tests. Let’s try doing a few of the most basic commands with this configuration.
First, make sure you are in the mongrel2 source directory and you’ve run the build so that the tests/config.sqlite file is primed. This is the base test case that we use in unit testing. After you have that, do this:
# get list of the available servers to run
m2sh servers -db tests/config.sqlite

# see what hosts a server has
m2sh hosts -db tests/config.sqlite -server test

# find out if a server named 'test' is running
m2sh running -db tests/config.sqlite -name test

# start a server whose default host is 'localhost'
m2sh start -db tests/config.sqlite -host localhost
At this point, you should have seen lists of servers and hosts, seen that mongrel2 is not running, and then started it. You can find out about all the commands and get help for them with m2sh help or m2sh help --on command.
You can now try doing some simple starting, stopping and reloading using sudo (make sure you CTRL-c to exit from the previous start command):
# start it so it runs in the background via sudo
m2sh start -db tests/config.sqlite -host localhost -sudo
tail logs/error.log

# reload it
m2sh reload -db tests/config.sqlite -host localhost
tail logs/error.log

# hit it with curl to see it do the reload
curl http://localhost:6767/
tail logs/error.log

# see if it's running then stop it
m2sh running -db tests/config.sqlite -host localhost
m2sh stop -db tests/config.sqlite -host localhost
If m2sh start runs fine, but m2sh start -sudo fails, you may need to link /proc in your chroot, using mkdir -p proc && sudo mount --bind /proc proc. You can also try installing ZeroMQ from source, if you’d rather avoid putting your /proc where it might get seen.
Awesome, right? Using just this one little management tool you are able to completely manage a Mongrel2 instance without having to hack on a config file at all. But you probably need to know how this is all working anyway.
You have now done nearly everything you can do to a configuration, but you might not know exactly what’s happening behind the scenes, so here’s a quick explanation:
All of this is happening by reading the tests/config.sqlite file and not reading any configuration files. You can now try building your own configuration that matches this one or some others.
To create a new config database, you’ll write a file that looks a lot like a configuration file. It looks like a Python file because the format comes from the first m2sh we wrote in Python (living in examples/python), but m2sh is now written in C. Even though it was rewritten, we managed to keep the same format, and even made it a little easier by making commas optional in most places.
First you load your configuration into a fresh database using m2sh load. For our example, we’ll use examples/configs/sample.conf to make a simple one:
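The file isn’t reproduced here, but a minimal configuration in this format looks something like the following sketch; the uuid, paths, and port are illustrative values, not anything you have to use:

main = Server(
    uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
    access_log="/logs/access.log",
    error_log="/logs/error.log",
    chroot="./",
    default_host="localhost",
    name="test",
    pid_file="/run/mongrel2.pid",
    port=6767,
    hosts=[
        Host(name="localhost", routes={
            '/tests/': Dir(base='tests/', index_file='index.html',
                           default_ctype='text/plain')
        })
    ]
)

servers = [main]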
If you aren’t familiar with Python, then this code might look freaky, but it’s really simple. We’ll get into how it’s structured in a second, but to load this file we would just do this:
m2sh load -config examples/configs/sample.conf
ls -l config.sqlite
m2sh servers
m2sh hosts -server test
m2sh start -name test
Notice that we didn’t have to tell m2sh that the database was config.sqlite. It assumes config.sqlite is the default database, and mongrel2.conf the default config file. If you stick to those two files, then you never have to type those parameters again.
With this sequence of commands you loaded the sample configuration into a fresh config.sqlite database, confirmed the database file was created, listed the servers and hosts it contains, and started the server named test.
By now you should be getting the hang of the pattern here, which is to use m2sh and a configuration “script” to generate .sqlite files that Mongrel2 understands.
The base structure of a Mongrel2 configuration is a Server, which contains Hosts, which contain Routes, and each Route points at a target: a Dir, a Proxy, or a Handler.
Each of these nested “objects” then has a set of attributes you can use to configure them, and most of them have reasonable defaults.
The server is all about telling Mongrel2 which port to listen on, where to chroot, and general server-specific deployment gear.
A host is matched using a kind of inverse route that matches the ending of Host: headers against a pattern. You’ll see how this works when we talk about routes, but for now you just need to know that requests to the Server.port are routed based on the Host configurations the Server contains.
The Route is the workhorse of the whole system. It uses some very fancy but still simple code in Mongrel2 to translate Host: headers to Hosts and URL paths to Handlers, Dirs, and Proxies.
Later on, you’ll learn about the pattern matching that’s used. It’s basically a stripped down version of normal regular expressions, with a few convenient syntaxes for doing simple string matching. When you configure a route, you write something like /images/(.*.jpg): the part before the ‘(’ is used as a fast-matched prefix, while the part after it is treated as a pattern to match. When a request comes in, Mongrel2 quickly finds the longest prefix that matches the URL, and then tests its pattern if there is one. If the pattern matches, the request goes through. If not, 404.
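For example, a route like this in a Host’s routes splits into the prefix /images/ and the pattern .*.jpg (the Dir values here are just placeholders):

'/images/(.*.jpg)': Dir(base='static/images/', index_file='index.html',
                        default_ctype='image/jpeg')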
A Dir is a simple directory-serving route target that serves files out of a directory. It has caching built-in, handles if-modified-since, ETags, and all the various bizarre HTTP caching mechanisms as RFC-accurately as possible. It also has default content-types and index files.
Currently, we don’t offer more parameters for configuration, but eventually you’ll be able to tweak more and more of the settings to control how Dirs work.
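In the config, a Dir route looks something like this; the paths and content type are illustrative:

'/static/': Dir(base='static/', index_file='index.html',
                default_ctype='text/plain')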
A proxy is used so that you can use Mongrel2 but not have to throw out your existing infrastructure. Mongrel2 goes to great pains to make sure that it implements a fast and dead-accurate proxy system internally, but no matter how good it is, it can’t compete with ZeroMQ handlers. The idea with giving Proxy functionality is that you can point Mongrel2 at existing servers, and then slowly carve out pieces that will work as handlers.
Requests that match a Proxy route are still parsed by Mongrel2’s incredibly accurate HTTP parser, so that your backend servers should not receive badly formatted HTTP requests. Responses from a Proxy server, however, are sent unaltered to the browser directly.
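In the config, a Proxy route just points at an existing backend’s address and port, something like this (the address and port are placeholders for your own backend):

'/blog/': Proxy(addr='127.0.0.1', port=8080)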
Now we get to the best part: the ZeroMQ Handlers that will receive asynchronous requests from Mongrel2. You need to use the ZeroMQ syntax for configuring them, but this means with one configuration format you can use handlers that are using UDP, TCP, Unix, or PGM transports. Most testing has been done with TCP transports.
The interesting thing about the Handler configuration is that you don’t have to say where the actual backend handlers live. Did you notice you aren’t declaring large clusters of proxies, proxy selection methods, or anything else, other than two 0MQ endpoints and some identifiers? This is because Mongrel2 is binding these sockets and listening. Mongrel2 doesn’t actively connect to backends; they connect to Mongrel2. This means, if you want to fire up 10 more handlers, you just start them; no need to restart or reconfigure Mongrel2 to make them active.
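In the config, a Handler spells out the two ZeroMQ endpoints Mongrel2 will bind, plus the sender and receiver identities; the addresses and the uuid ident here are illustrative:

handler_test = Handler(send_spec='tcp://127.0.0.1:9997',
                       send_ident='34f9ceee-cd52-4b7f-b197-88bf2f0ec378',
                       recv_spec='tcp://127.0.0.1:9996',
                       recv_ident='')

You then attach it to a route like any other target:

'/handlertest/': handler_test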
There’s also Log, MIMEType, and Setting objects/tables you can work with, but we’ll get into those later, since you don’t need to know about them to understand the Mongrel2 structure.
All of this knowledge about the Mongrel2 configuration structure can now be used to take a look at a more complex example. We’ll take a look at this example and I’ll just say what’s going on, and you try to match what I’m saying to the code. Here’s the examples/configs/mongrel2.conf file:
If you haven’t guessed yet, this configuration is what’s used on http://mongrel2.org to configure the main test system. In it we’ve got the following things to check out:
And that, my friends, is the most complex configuration we have so far.
The pattern code was taken from Lua and is some of the simplest code for doing fast pattern matches. It is very much like regular expressions, except it removes a lot of features you don’t need for routes. Also, unlike regular expressions, URL patterns always match from the start. Mongrel2 uses them by breaking routes up into a prefix part and a pattern part. It then finds the route with the longest matching prefix and tests that route’s pattern. If the pattern matches, then the route works. If the route doesn’t have a pattern, then it’s assumed to match, and you’re done.
The only caveat is that you have to wrap your pattern parts in parentheses, but these don’t mean anything other than delimiting where a pattern starts. So instead of /images/.*.jpg, write /images/(.*.jpg) for it to work.
Here’s the list of characters you can use in your patterns:
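Assuming Mongrel2 kept Lua’s character classes (with a backslash escape rather than Lua’s %), the usual set is: . (any character), \a (letters), \c (control characters), \d (digits), \l (lowercase letters), \p (punctuation), \s (whitespace), \u (uppercase letters), \w (alphanumeric characters), \x (hexadecimal digits), plus [set] character classes and the *, +, -, and ? repetition markers.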
Using the uppercase version of an escaped character makes it work the opposite way (e.g., \A matches any character that isn’t a letter). The backslash can be used to escape the following character, disabling its special abilities (e.g., \\ will match a literal backslash).
Anything that’s not listed here is matched literally.
Yep, I get it. You think that everyone should use UTF-8 or some Unicode encoding for everything. You despise the dominance of the ‘A’ in ASCII and hate that you can’t put your spoken language right in a URL.
Well, I hate to say it, but tough. Protocols are hard enough without having to worry about the bewildering mess that is Unicode. When you sit down to write a network protocol, the last thing you need is a format that’s inconsistent, has multiple interpretations, can’t be properly capitalized or lowercased, and requires extra translation steps for every operation. With ASCII, every computer just knows what it is, and it’s the fastest for creating wire protocol formats.
This is why, on the Internet, you have to do things to URLs to make them ASCII, like encoding them with % signs. It’s in the standard, and it’s the smart thing to do. I don’t want to have to know the difference between the various accents in your spoken language to route a URL around. I just want to deal with a fixed set of characters and be done with it. Don’t blame me or Mongrel2 for this, it’s just the way the standard is and the way to get a server that is stable and works.
Protocols work better when there’s less politics in their design. This means you can’t put Unicode into your URL patterns. I mean, you can try; but the behavior is completely undefined.
Here are some example routes you can try to get a feel for the system:
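Something along these lines; the Dir, Proxy, and chat_handler targets are placeholders, the routes themselves are the point:

Host(name="localhost", routes={
    "/": Dir(base="static/", index_file="index.html",
             default_ctype="text/html"),
    "/images/(.*\.jpg)": Dir(base="static/images/", index_file="index.html",
                             default_ctype="image/jpeg"),
    "/api/": Proxy(addr="127.0.0.1", port=8080),
    "@chat": chat_handler
})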
That should give you an idea of how you can use them. Notice also that I’m using the Python "blah" string syntax, which is interchangeable with the ’blah’ syntax, so I don’t have to double escape everything.
The routing algorithm is actually kind of simple, but it’s an unfamiliar algorithm to most programmers. I won’t go into the details of how a “Ternary Search Tree” works, but basically it lets you match one prefix against a bunch of other strings very fast. This data structure lets Mongrel2 very quickly determine the target for a route, and also know if it has a route at all. Typically, it can match a route in just a few characters, and reject a route in even fewer.
For practical usage, it’s better to just read how it works, rather than how it’s implemented. Here’s how Mongrel2 matches an incoming URL against the routes you’ve given it: it first finds the Host that matches the request’s Host: header, then uses the prefix tree to find the route with the longest prefix that matches the URL, and then tests that route’s pattern (if it has one) against the rest of the URL. If the pattern matches, or the route has no pattern, the request goes to that route’s target; otherwise Mongrel2 returns a 404.
That should show you how routes work. The important thing to realize is that the route with the longest matching prefix is what we call the “best” route. If you get unexpected routing behavior, then make the problem routes explicit by putting a pattern at the end.
Finally, here are some examples in the spirit of the unit test we have for the routing system. Imagine we have these routes:
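These aren’t the literal test fixtures, just a representative set; handler_test stands in for a real Handler and the Dir settings are placeholders:

handler_test = Handler(send_spec="tcp://127.0.0.1:9997",
                       send_ident="34f9ceee-cd52-4b7f-b197-88bf2f0ec378",
                       recv_spec="tcp://127.0.0.1:9996",
                       recv_ident="")

Host(name="localhost", routes={
    "/tests/": Dir(base="tests/", index_file="index.html",
                   default_ctype="text/plain"),
    "/chat/": handler_test,
    "/images/(.*\.jpg)": Dir(base="static/images/", index_file="index.html",
                             default_ctype="image/jpeg")
})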
Then this is how a set of example requests would match:
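/tests/index.html     routes to the /tests/ Dir (prefix match, no pattern to test)
/chat/stream          routes to the /chat/ Handler
/images/logo.jpg      routes to the /images/ Dir (prefix and pattern both match)
/images/logo.png      404, the /images/ prefix matches but the pattern fails
/missing              404, no route prefix matches at all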
Work through those in your head so you make sure you understand them.
Mongrel2 works with Flash sockets out of the box (with WebSockets coming soon) and can handle either XML messages or special JSON messages. It does this by modifying its internal parser to accept either HTTP or (exclusively) XML and JSON messages. This feature can be used by any TCP client, not just Flash; it just happens to be an easy way to send simple async messages without using HTTP.
To make it work, there’s a slight modification to the routes used by JSON or XML messages. Basically, JSON routes start with a ’@’ and XML routes start with a ’<’, and both kinds of messages must be terminated with a NUL byte ’\0’. When the parser sees these at the beginning of a request, it parses that message and sends it “as-is” to your target handler.
Let’s look at two examples from the chat demo and from some test suites:
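In config terms, they look roughly like this; the Handler endpoints and idents are placeholders:

chat_demo = Handler(send_spec='tcp://127.0.0.1:9999',
                    send_ident='54c6755b-9628-40a4-9a2d-cc82a816345e',
                    recv_spec='tcp://127.0.0.1:9998',
                    recv_ident='')

xml_demo = Handler(send_spec='tcp://127.0.0.1:9995',
                   send_ident='8b57c9d1-6de5-4f38-a7a6-847f4cf785d2',
                   recv_spec='tcp://127.0.0.1:9994',
                   recv_ident='')

routes={
    '@chat': chat_demo,
    '<test': xml_demo
}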
The first one will take any Flash (or plain TCP) connection that sends lines like @chat {"msg": "hello"}\0 and route those to the chat_demo handler. You can connect and then just stream these JSON messages all you want, and handlers can send back the same kind of responses. In fact, as long as you don’t include a ’\0’ character, you could probably send anything you want.
The second route will take any XML that is wrapped in a <test> tag and send that to your handlers. That means you can send <test name="joe"><age>21</age></test> and it will send it to xml_demo.
This is powerful because Mongrel2 now becomes a generic XML or JSON messaging server very easily. For example, I wrote a simple little BBS demo with Mongrel2 and wrote a very basic terminal client in Python for people to use instead of the browser. Look at examples/bbs/client.py to see how that works in full, but the meat of it is:
import socket
import json
from base64 import b64decode

# host and port are defined earlier in client.py
CONN = socket.socket()
CONN.connect((host, port))

def read_msg():
    # read one byte at a time until the terminating '\0'
    reply = ""
    ch = CONN.recv(1)

    while ch != '\0':
        reply += ch
        ch = CONN.recv(1)

    return json.loads(b64decode(reply))

def post_msg(data):
    msg = '@bbs %s\x00' % (json.dumps({'type': 'msg', 'msg': data}))
    CONN.send(msg)
In that code, notice how (for historical reasons due to Flash sucking) the response is base64 encoded, but your handler doesn’t have to do that; you can just adopt the same protocol back. Other than that, the BBS example client is just opening a socket and sending messages, and Mongrel2 is converting them into messages for backend handlers to process.
Finally, here’s the grammar rules in the parser for handling these messages:
rel_path = (path? (";" params)?) ("?" query)?;

SocketJSONStart = ("@" rel_path);
SocketJSONData = "{" any* "}" :>> "\0";
SocketXMLData = ("<" [a-z0-9A-Z\-.]+) ("/" | space | ">") any* ">" :>> "\0";

SocketJSON = SocketJSONStart " " SocketJSONData;
SocketXML = SocketXMLData;

SocketRequest = (SocketXML | SocketJSON);
If you read that carefully, you’ll see you can actually pass query strings and path parameters to your JSON socket handlers. That’s currently not used, but in the future we might use it.
One caveat to this whole feature is these targets can only be routed to the Server.default_host of the server. There’s not enough information in these routes to determine a target host (like the Host: header in HTTP) so you can only send it to the default target host.
A very nice feature for people doing operations work is that m2sh keeps track of all the commands you run while you work, and lets you add little commit messages to the log for documentation later. These logs are maintained even across m2sh load commands so you can see what’s been going on. They track who did something, what server they did it on, when they did it, and what they did.
To see the logs for your own tests, just do m2sh log -db simple.sqlite and then, if you want to add a commit log message, you use the m2sh commit command. Here’s an example from mongrel2.org:
The motivation for this feature is the trend of ops storing server configurations in revision control systems like git or etckeeper. That works great for holding the configuration files, but it doesn’t tell you what happened on each server. In many cases, the configuration files also need to be reworked or altered for each deployment. With the m2sh log and commit system, you can augment your revision control with deployment action tracking.
Later versions of Mongrel2 will keep small amounts of statistics which will link these actions to changes in Mongrel2 behavior like frequent crashing, failures, slowness, or other problems.
Basically, there’s nowhere to hide. Mongrel2 will help operations figure out who needs to get fired the next time Twitter goes down.
Just before the release of 1.0, we added a feature called the “Control Port”, which lets you connect to a running Mongrel2 server over a unix (domain) socket and give it control commands. These commands let you get the status of running tasks, lists of currently connected sockets and how long they’ve been connected, the server’s current time and kill a connection. Using this control port, you can then implement any monitoring and timeout policies you want, and provide better status.
By default, the control port is in your chroot at run/control, but you can set the control_port setting to change this. You can actually change it to any ZeroMQ valid spec you want, although you’re advised to use IPC for security.
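For example, assuming the setting takes a full ZeroMQ spec (as the Python client further down suggests), moving it would look something like:

settings = {"control_port": "ipc://run/control_2"}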
Once Mongrel2 starts, you can then use m2sh to connect to Mongrel2 and control it using the simple command language. Currently, what you get back is very raw, but it will improve as we work on the control port and what it does.
The commands you can issue are listed in the help output below. You use the control port by running m2sh:
m2sh control -every

m2 [test]> help
name            help
stop            stop the server (SIGINT)
reload          reload the server
help            this command
control_stop    stop control port
kill            kill a connection
status          status, what=['net'|'tasks']
terminate       terminate the server (SIGTERM)
time            the server's time
uuid            the server's uuid
info            information about this server

m2 [test]> info
port: 6767
bind_addr: 0.0.0.0
uuid: f400bf85-4538-4f7a-8908-67e313d515c2
chroot: ./
access_log: .//logs/access.log
error_log: /logs/error.log
pid_file: ./run/mongrel2.pid
default_hostname: localhost

m2 [test]>
The protocol to and from the control socket is a simple tnetstring in and out that any language can read. Here’s a nearly complete Python client that uses the control port:
import zmq
from mongrel2 import tnetstrings
from pprint import pprint

CTX = zmq.Context()

addr = "ipc://run/control"
ctl = CTX.socket(zmq.REQ)

print "CONNECTING"
ctl.connect(addr)

while True:
    cmd = raw_input("> ")
    # will only work with simple commands that have no arguments
    ctl.send(tnetstrings.dump([cmd, {}]))
    resp = ctl.recv()
    pprint(tnetstrings.parse(resp))

ctl.close()
You obviously don’t need to do this, but should you want to do something special like a management interface, this is your start.
A Mongrel2 process itself does not have any support for running multiple servers; instead, it takes two simple parameters: a sqlite config database and a server uuid that names the server to be launched. This is done to keep the mongrel2 code simple and workable.
However.
Mongrel2’s m2sh does support launching multiple servers from a single configuration database. By passing -every to many m2sh commands, you are able to perform actions on all configured servers at once. You can also perform actions on single servers by specifying their uuid, name or host. If any parameter given is ambiguous (that is if, for example, you search with -host localhost and your config contains two servers which attempt to bind to localhost), m2sh will list the matching servers and ask you to clarify your selection.
For example:
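Using the flags shown earlier (the database and server names here are just the ones from the sample config):

# act on every server in the config
m2sh start -db config.sqlite -every

# or pick a single server by uuid, name, or host
m2sh stop -db config.sqlite -name test
m2sh reload -db config.sqlite -host localhost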
Many of Mongrel2’s internal settings are configurable using the settings system. Some of these are dangerous to mess with, so make sure you test any changes before you try to run them. Setting them to 0 or negative numbers isn’t checked, so if you make a setting and things go crazy, you need to not make that setting. All of these have good defaults so you can leave them alone unless you need to change them.
To configure your settings, you set the variable settings and you’re done:
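A sketch; the setting names here are ones I believe exist, but treat them as examples and check the settings list for your version:

settings = {"zeromq.threads": 1, "limits.content_length": 20480}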
Mongrel2 will read these on the fly and write INFO log messages telling you what the settings are, so you can debug them if they cause problems. The list of available settings is:
You can also update your mimetypes in the same way, just set a variable with them:
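Assuming the variable is called mimetypes, a sketch would be:

mimetypes = {".txt": "text/plain",
             ".mobi": "application/x-mobipocket-ebook"}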
Mongrel2 now supports SSL, with preliminary support for SSL session caching. As of v1.8.0 (actually earlier) you can enable SSL very easily for your Mongrel2 server. Mongrel2 configures SSL certs with two options in settings, plus a directory of .crt and .key files named after the UUIDs of the servers that need them.
To get started, you can make a simple self-signed certificate with some weak encryption and set up your certs directory:
# make a certs directory
mkdir certs

# list out your servers so you can get the UUID
m2sh servers

# go into the certs directory
cd certs

# make a self-signed weak cert to play with
openssl genrsa -des3 -out server.key 512
openssl req -new -key server.key -out server.csr
cp server.key server.key.org
openssl rsa -in server.key.org -out server.key
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

# finally, copy the server.crt and server.key files over to the UUID for that
# server configuration in your mongrel2.conf
mv server.crt 2f62bd5-9e59-49cd-993c-3b6013c28f05.crt
mv server.key 2f62bd5-9e59-49cd-993c-3b6013c28f05.key
I actually have a shell script kind of like this, since I can never remember how to set this stuff up with openssl. Also, you should really adjust the RSA key strength from 512 to something you’re comfortable with. I’m using a weak key here so you can do performance testing and thrashing, and then compare with your real key later.
Once you have that done, you just have to add three little settings to your mongrel2 conf. After that, your config should look something like this:
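This is a sketch of what that might look like; the certdir setting and use_ssl flag are my best guesses at the option names, so double check them against the SSL docs for your version, and the other values are the same illustrative ones from earlier:

settings = {"certdir": "./certs/"}

main = Server(
    uuid="2f62bd5-9e59-49cd-993c-3b6013c28f05",
    access_log="/logs/access.log",
    error_log="/logs/error.log",
    chroot="./",
    default_host="localhost",
    name="main",
    pid_file="/run/mongrel2.pid",
    port=6767,
    use_ssl=1,
    hosts=[
        Host(name="localhost", routes={
            '/tests/': Dir(base='tests/', index_file='index.html',
                           default_ctype='text/plain')
        })
    ]
)

servers = [main]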
Get that written, rerun m2sh config to make the new config, restart Mongrel2 (you can’t reload to enable SSL), and it should be working.
After you get this working you just have to get your own certificate, put it in the certs directory with the right filename, and you should be good to go.
Starting with 1.9.0, mongrel2 adds SNI support. On any request, mongrel2 will first search for files named hostname.crt and hostname.key; if found, those will be used, otherwise it falls back to the UUID key and certificate files.
We’ve got experimental SSL caching working, which will try to reuse the browser’s SSL session if it’s there. This is meant to be a trade-off between memory and performance, so it can chew a bunch of RAM if you have a lot of SSL traffic over a short period of time. We’ll be making the caching more configurable, but for now, it’s working and does speed up SSL clients that do it properly.
The Mongrel2 v1.8.0 release also included working filters that you can configure and load dynamically. The filters are very fresh, and the only one available is the null filter found in tools/filters/null.c, but it does work and you can configure it. It’s also not yet hooked into the reload gear we’ve recently done, so don’t expect it to work if you do frequent hot reloading.
Configuring a filter is fairly easy, take a look at this example:
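A sketch of such a config; the Filter(name=..., settings=...) form matches how the text describes it, but treat the exact fields and settings as assumptions, and the Server values are the same illustrative ones used earlier:

null_filter = Filter(
    name="./tools/filters/null.so",
    settings={"extensions": ["*.html", "*.txt"], "min_size": 1000}
)

main = Server(
    uuid="f400bf85-4538-4f7a-8908-67e313d515c2",
    access_log="/logs/access.log",
    error_log="/logs/error.log",
    chroot="./",
    default_host="localhost",
    name="test",
    pid_file="/run/mongrel2.pid",
    port=6767,
    hosts=[
        Host(name="localhost", routes={
            '/tests/': Dir(base='tests/', index_file='index.html',
                           default_ctype='text/plain')
        })
    ],
    filters=[null_filter]
)

servers = [main]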
First you can see that we set up the null filter with some arbitrary settings and point at where the .so file is. Filters can be configured with any arbitrarily nested data structure that fits in a tnetstring, so you can pass them pretty much anything that matters: lists, dicts, numbers, and strings are the main ones. You can also use variables in the config file, so you could create different servers and share config options for Filters and other parts of the file.
After that, there’s simply a Server.filters attribute that takes a list of filters to load. If you don’t set this variable, the filter gear isn’t even loaded and your server behaves as normal. If you do set it, the filters are installed and will run.
If you run this config, you’ll see the filter printing out its config as a tnetstring and then closing the connection, but only if you go to /nulltest/. If you go to /tests/sample.html to get at a directory, it won’t even run.
We’ll have more documentation on actually writing filters in the Hacking section.