mod_python

Logging Apache Subrequests with mod_python

Recently at work, I've been writing a Python-based Subversion authorizer. Instead of using mod_authz_svn, which requires your groups and authorization rules to be in a text file, we are using something that fits our product's needs. As I was doing this, I needed a good way to log Subversion's request cycle. Your first thought would be to look at the Apache access logs. However, Subversion's mod_dav_svn uses Apache subrequests to communicate (when enabled, of course), and since Apache doesn't log subrequests, I needed something custom.

In a previous blog entry, Using mod_python for Custom Apache/Subversion Authentication/Authorization, I discussed a few reasons why using mod_python was an excellent choice for writing Apache server extensions. That being said, I chose mod_python again for creating a way to log Apache's subrequests.

I will be using my need for logging the Subversion request cycle as the basis for the example code and Apache configuration you see. When adapting this for your own use, you'll need to make changes to fit your environment. The good thing is that the example code is well documented, so this should be easy. With that in mind, let's see how we can get mod_python involved in our Apache communication:

  ...
  # Your path might be different
  LoadModule python_module libexec/apache2/mod_python.so
  ...
  <Location /repos>
    # Subversion configuration
    ...
    # mod_python configuration
    SetHandler mod_python

    # The interpreter to use
    PythonInterpreter main_interpreter

    # The mod_python handler and the module/script that contains the handler function
    PythonLogHandler svn_logger

    # Modify the sys.path to be able to locate your module/script
    PythonPath "['/opt/svn'] + sys.path"
    ...
  </Location>
  ...

The snippet above shows how I took an existing Subversion configuration and added the mod_python bits to it. The Apache directives for mod_python are pretty straightforward, but if you need more details, please refer to the mod_python documentation. Now on to the actual script used to log the Apache requests and subrequests for Subversion.

  #!/usr/bin/env python
  #
  # -*- python -*-
  #
  # Simple script that uses mod_python to log the Subversion client requests,
  # including the subrequests made on the server side.

  from mod_python import apache

  def loghandler(req):
      """ mod_python handler function taking an Apache request object. """
      req_type = 'main'

      # req.main points to the main request when this request is a subrequest
      if req.main:
          req_type = 'subrequest'

      log = open('/tmp/svn_requests.log', 'a')

      log.write("[%s] %s->%s\n" % (req_type, req.method, req.uri))

      log.close()

      return apache.OK

  # loghandler()

That is it. As you can see, the name of the function that you need to implement for mod_python to be happy is the name of the handler directive, all lower case and without the "Python" in front, so PythonLogHandler expects a function named loghandler. (Note: mod_python has a few different ways to specify a handler, even allowing you to name the function to be called, which eliminates the need to conform to the convention if you want to be such a rebel.) Once we got mod_python's requirements out of the way, the code to log the requests was quite simple. In a production environment, where you don't want a logging problem to break the request, you could easily use a try/except block to catch any errors while opening and writing to the file. Just make sure to return "apache.OK" to let Apache continue doing its thing.
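The try/except idea mentioned above could look something like the following. This is only a sketch under a few assumptions: the LOG_PATH constant, the optional log_path parameter, and the small stand-in apache class (used only when mod_python cannot be imported, so the module can be exercised outside Apache) are illustrative additions and not part of the original script.

```python
# Sketch of a log handler that never lets a logging failure break the request.
try:
    from mod_python import apache
except ImportError:
    # Assumption: stand-in so this file can be imported outside Apache.
    # OK mirrors the value mod_python exposes.
    class apache(object):
        OK = 0

# Hypothetical default location, same as the original script.
LOG_PATH = '/tmp/svn_requests.log'


def loghandler(req, log_path=LOG_PATH):
    """Log main requests and subrequests; swallow file errors."""
    req_type = 'main'

    # req.main is set only when this request is a subrequest
    if req.main:
        req_type = 'subrequest'

    try:
        log = open(log_path, 'a')
        try:
            log.write("[%s] %s->%s\n" % (req_type, req.method, req.uri))
        finally:
            # Always release the file handle, even if the write fails
            log.close()
    except (IOError, OSError):
        # A missing directory or a permissions problem should not
        # affect the Subversion request itself.
        pass

    # Always let Apache continue doing its thing
    return apache.OK
```

mod_python only ever passes the request object, so the extra log_path argument simply falls back to its default under Apache; it is there to make the function easy to try out with a fake request object.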

I hope this simple example of using mod_python to log Apache subrequests was enlightening. I know it has helped me tremendously.


Using mod_python for Custom Apache/Subversion Authentication/Authorization

As most are aware, Apache is a very modular, highly developer-friendly web server. It still serves more web content worldwide than any other web server. The problem for some is that it is written in C, so adding custom functionality to Apache usually means writing an Apache module in C. What you probably didn't know is that there is an easier way, and it is due to mod_python.

mod_python is "an Apache module that embeds the Python interpreter within the server". That is kind of vague, and the mod_python homepage even suggests reading the "Introducing mod_python" article from O'Reilly. To save you the trouble, here are the important parts in the context we're discussing. mod_python provides the following:

  • A handler for the Apache request processing phases, like authentication (authn) and authorization (authz)
  • An interface to a subset of the Apache API, which means you can call internal Apache functions from Python

Having access to the internal Apache API, or at least some of it, and being able to handle the Apache phases are the parts we're interested in, since a few of those phases are devoted to authn/authz. Before going into any examples or details on how to use these features for creating your own authn/authz module using mod_python, let's go through a very simple series of steps for hooking mod_python into Apache for your application. (We will not be talking about how to install mod_python since there is a plethora of information about this online.) For this example, we'll talk about how you could use mod_python to create your own custom authn/authz for Subversion.

Hook mod_python into Apache

As usual, there are many ways to do this, but since our example is for authenticating/authorizing Subversion access, we'll be putting our mod_python stuff within a standard "Location" block. Here is a simple Location block that we will be starting with:

  # Work around authz and SVNListParentPath issue
  RedirectMatch ^(/repos)$ $1/

  # Enable Subversion logging
  CustomLog logs/svn_logfile "%t %u %{SVN-ACTION}e" env=SVN-ACTION

  <Location /repos/>
    # Enable Subversion
    DAV svn

    # Directory containing all repositories for this path
    SVNParentPath /opt/repos/svn

    # Enable repository listing when browsing the Location root
    SVNListParentPath On

    # Enable WebDAV automatic versioning
    SVNAutoversioning On

    # Repository Display Name when browsing with the built-in repository browser
    SVNReposName "Your Subversion Repository"
  </Location>

The Location block above is one of the simplest ways to expose a Subversion repository. As you can see, there is no authn/authz stuff in here, meaning that if you exposed your repositories using this Location block right now, anyone would have free rein to do whatever they wanted to the repositories and their contents. Now let's update our Location block to have the necessary bits to require Apache authentication and to make our Python script the handler of the authn/authz Apache phases.

  # Work around authz and SVNListParentPath issue
  RedirectMatch ^(/repos)$ $1/

  # Enable Subversion logging
  CustomLog logs/svn_logfile "%t %u %{SVN-ACTION}e" env=SVN-ACTION

  <Location /repos/>
    # Enable Subversion
    DAV svn

    # Directory containing all repositories for this path
    SVNParentPath /opt/repos/svn

    # Enable repository listing when browsing the Location root
    SVNListParentPath On

    # Enable WebDAV automatic versioning
    SVNAutoversioning On

    # Repository Display Name when browsing with the built-in repository browser
    SVNReposName "Your Subversion Repository"

    # Do basic password authentication in the clear
    AuthType Basic

    # The name of the protected area or "realm"
    AuthName "Your Subversion Repository"

    # Require authentication
    Require valid-user

    # Make Apache aware that we want to use mod_python
    AddHandler mod_python .py

    # Specify the callable object to handle the authn phase
    PythonAuthenHandler svnauth

    # Specify the callable object to handle the authz phase
    PythonAuthzHandler svnauth

    # Optionally extend the Python path to locate your callable object
    PythonPath "sys.path+['/opt/repos/scripts']"
  </Location>

To better understand the options for the mod_python directives in the Location block, refer to the entries for PythonAuthenHandler, PythonAuthzHandler, and PythonPath in the mod_python documentation.

Alright, now that we have Apache ready to use mod_python, we need to create the script used for authn/authz. Since the PythonAuthenHandler and PythonAuthzHandler directives just specify a module name, we need to have a file named svnauth.py somewhere on the Python path. We can put it into /opt/repos/scripts since we used the PythonPath directive to update the Python path to include that directory. Now on to the file's contents.

Since we just specified a module name in the mod_python handler directives, we need to follow a specific naming convention for our handler function names. The convention is that you strip off the "Python" part of the handler directive's name and create a function in your module with that name, all lower case of course, that accepts one argument: the Apache request object. Here is a very simple example of our svnauth.py:

  #!/usr/bin/env python
  # -*-python-*-
  #
  # Simple example of using mod_python to handle
  # Apache's authn and authz phases.

  from mod_python import apache

  def authenhandler(req):
      """This function gets called by mod_python to handle Apache's authentication phase"""

      return apache.OK

  def authzhandler(req):
      """This function gets called by mod_python to handle Apache's authorization phase"""

      return apache.OK

Pretty simple, huh? As the functions stand now, they just return "apache.OK", which tells Apache that the requesting user is authenticated and authorized to access the repositories. So now that you know how to hook up mod_python to handle Apache phases like authentication and authorization, where do you go from here? Well, that is up to you. This "article" is only meant to help get you started. But with me being a nice guy, here are some examples of how you might use this knowledge, not only for Subversion but for any Apache-based application where you need to use your own authentication/authorization measures:

  • Making web service calls to authenticate/authorize a user
  • Implementing single sign on for your application
  • Whenever your company stores authn/authz information in a way that is not accessible by conventional Apache modules, like storing your Subversion authz information in a database instead of the typical authz file
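
To make the last idea a little more concrete, here is a rough sketch of handlers that do more than return "apache.OK". Everything about the data is an assumption for illustration: the USERS and WRITERS stores are hypothetical in-memory stand-ins for a database or web service, the read-only method list is a simplification of what Subversion clients send, and plain-text passwords are used only to keep the example short. The stand-in apache class is only there so the module can be tried outside Apache; under mod_python the real constants are used.

```python
# Sketch of authn/authz handlers backed by a custom (hypothetical) store.
try:
    from mod_python import apache
except ImportError:
    # Assumption: stand-in mirroring the mod_python status constants.
    class apache(object):
        OK = 0
        HTTP_UNAUTHORIZED = 401
        HTTP_FORBIDDEN = 403

# Hypothetical stores; a real deployment would query a database or service.
USERS = {'alice': 'secret', 'bob': 'hunter2'}
WRITERS = set(['alice'])

# Simplified assumption about which HTTP/WebDAV methods only read data.
READ_ONLY_METHODS = set(['GET', 'HEAD', 'OPTIONS', 'PROPFIND', 'REPORT'])


def authenhandler(req):
    """Authentication phase: check the Basic auth credentials."""
    # get_basic_auth_pw() must be called before reading req.user
    password = req.get_basic_auth_pw()
    user = req.user

    if user in USERS and USERS[user] == password:
        return apache.OK

    return apache.HTTP_UNAUTHORIZED


def authzhandler(req):
    """Authorization phase: everyone may read, only writers may change."""
    if req.method in READ_ONLY_METHODS or req.user in WRITERS:
        return apache.OK

    return apache.HTTP_FORBIDDEN
```

Swapping the dictionary lookups for a database query or a web service call is where your own requirements come in; the handler names and return values are the only parts mod_python cares about.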

The possibilities for using mod_python for authn/authz are endless. Honestly, the possibilities for using mod_python for any of the Apache phases are endless. Sure beats writing C-based Apache modules when it comes to simplicity.

As for what we've learned, I hope that I've taught you enough about mod_python and how it can be used with Apache for creating your own pseudo Apache modules. The example I gave is a real-world example where you can use mod_python to implement your own authn/authz for Subversion, but that is just an example. Anywhere Apache is used in your infrastructure, mod_python can be used to make hooking into Apache simple and even provide you the ability to do things that the built-in Apache modules don't provide.