summaryrefslogtreecommitdiffstats
path: root/google_appengine/lib/django/docs/cache.txt
diff options
context:
space:
mode:
Diffstat (limited to 'google_appengine/lib/django/docs/cache.txt')
-rw-r--r--google_appengine/lib/django/docs/cache.txt543
1 files changed, 543 insertions, 0 deletions
diff --git a/google_appengine/lib/django/docs/cache.txt b/google_appengine/lib/django/docs/cache.txt
new file mode 100644
index 0000000..054d998
--- /dev/null
+++ b/google_appengine/lib/django/docs/cache.txt
@@ -0,0 +1,543 @@
+========================
+Django's cache framework
+========================
+
+A fundamental tradeoff in dynamic Web sites is, well, they're dynamic. Each
+time a user requests a page, the Web server makes all sorts of calculations --
+from database queries to template rendering to business logic -- to create the
+page that your site's visitor sees. This is a lot more expensive, from a
+processing-overhead perspective, than your standard read-a-file-off-the-filesystem
+server arrangement.
+
+For most Web applications, this overhead isn't a big deal. Most Web
+applications aren't washingtonpost.com or slashdot.org; they're simply small-
+to medium-sized sites with so-so traffic. But for medium- to high-traffic
+sites, it's essential to cut as much overhead as possible.
+
+That's where caching comes in.
+
+To cache something is to save the result of an expensive calculation so that
+you don't have to perform the calculation next time. Here's some pseudocode
+explaining how this would work for a dynamically generated Web page::
+
+ given a URL, try finding that page in the cache
+ if the page is in the cache:
+ return the cached page
+ else:
+ generate the page
+ save the generated page in the cache (for next time)
+ return the generated page
+
+Django comes with a robust cache system that lets you save dynamic pages so
+they don't have to be calculated for each request. For convenience, Django
+offers different levels of cache granularity: You can cache the output of
+specific views, you can cache only the pieces that are difficult to produce, or
+you can cache your entire site.
+
+Django also works well with "upstream" caches, such as Squid
+(http://www.squid-cache.org/) and browser-based caches. These are the types of
+caches that you don't directly control but to which you can provide hints (via
+HTTP headers) about which parts of your site should be cached, and how.
+
+Setting up the cache
+====================
+
+The cache system requires a small amount of setup. Namely, you have to tell it
+where your cached data should live -- whether in a database, on the filesystem
+or directly in memory. This is an important decision that affects your cache's
+performance; yes, some cache types are faster than others.
+
+Your cache preference goes in the ``CACHE_BACKEND`` setting in your settings
+file. Here's an explanation of all available values for CACHE_BACKEND.
+
+Memcached
+---------
+
+By far the fastest, most efficient type of cache available to Django, Memcached
+is an entirely memory-based cache framework originally developed to handle high
+loads at LiveJournal.com and subsequently open-sourced by Danga Interactive.
+It's used by sites such as Slashdot and Wikipedia to reduce database access and
+dramatically increase site performance.
+
+Memcached is available for free at http://danga.com/memcached/ . It runs as a
+daemon and is allotted a specified amount of RAM. All it does is provide an
+interface -- a *super-lightning-fast* interface -- for adding, retrieving and
+deleting arbitrary data in the cache. All data is stored directly in memory,
+so there's no overhead of database or filesystem usage.
+
+After installing Memcached itself, you'll need to install the Memcached Python
+bindings. They're in a single Python module, memcache.py, available at
+ftp://ftp.tummy.com/pub/python-memcached/ . If that URL is no longer valid,
+just go to the Memcached Web site (http://www.danga.com/memcached/) and get the
+Python bindings from the "Client APIs" section.
+
+To use Memcached with Django, set ``CACHE_BACKEND`` to
+``memcached://ip:port/``, where ``ip`` is the IP address of the Memcached
+daemon and ``port`` is the port on which Memcached is running.
+
+In this example, Memcached is running on localhost (127.0.0.1) port 11211::
+
+ CACHE_BACKEND = 'memcached://127.0.0.1:11211/'
+
+One excellent feature of Memcached is its ability to share cache over multiple
+servers. To take advantage of this feature, include all server addresses in
+``CACHE_BACKEND``, separated by semicolons. In this example, the cache is
+shared over Memcached instances running on IP address 172.19.26.240 and
+172.19.26.242, both on port 11211::
+
+ CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'
+
+Memory-based caching has one disadvantage: Because the cached data is stored in
+memory, the data will be lost if your server crashes. Clearly, memory isn't
+intended for permanent data storage, so don't rely on memory-based caching as
+your only data storage. Actually, none of the Django caching backends should be
+used for permanent storage -- they're all intended to be solutions for caching,
+not storage -- but we point this out here because memory-based caching is
+particularly temporary.
+
+Database caching
+----------------
+
+To use a database table as your cache backend, first create a cache table in
+your database by running this command::
+
+ python manage.py createcachetable [cache_table_name]
+
+...where ``[cache_table_name]`` is the name of the database table to create.
+(This name can be whatever you want, as long as it's a valid table name that's
+not already being used in your database.) This command creates a single table
+in your database that is in the proper format that Django's database-cache
+system expects.
+
+Once you've created that database table, set your ``CACHE_BACKEND`` setting to
+``"db://tablename/"``, where ``tablename`` is the name of the database table.
+In this example, the cache table's name is ``my_cache_table``:
+
+ CACHE_BACKEND = 'db://my_cache_table'
+
+Database caching works best if you've got a fast, well-indexed database server.
+
+Filesystem caching
+------------------
+
+To store cached items on a filesystem, use the ``"file://"`` cache type for
+``CACHE_BACKEND``. For example, to store cached data in ``/var/tmp/django_cache``,
+use this setting::
+
+ CACHE_BACKEND = 'file:///var/tmp/django_cache'
+
+Note that there are three forward slashes toward the beginning of that example.
+The first two are for ``file://``, and the third is the first character of the
+directory path, ``/var/tmp/django_cache``.
+
+The directory path should be absolute -- that is, it should start at the root
+of your filesystem. It doesn't matter whether you put a slash at the end of the
+setting.
+
+Make sure the directory pointed-to by this setting exists and is readable and
+writable by the system user under which your Web server runs. Continuing the
+above example, if your server runs as the user ``apache``, make sure the
+directory ``/var/tmp/django_cache`` exists and is readable and writable by the
+user ``apache``.
+
+Local-memory caching
+--------------------
+
+If you want the speed advantages of in-memory caching but don't have the
+capability of running Memcached, consider the local-memory cache backend. This
+cache is multi-process and thread-safe. To use it, set ``CACHE_BACKEND`` to
+``"locmem:///"``. For example::
+
+ CACHE_BACKEND = 'locmem:///'
+
+Simple caching (for development)
+--------------------------------
+
+A simple, single-process memory cache is available as ``"simple:///"``. This
+merely saves cached data in-process, which means it should only be used in
+development or testing environments. For example::
+
+ CACHE_BACKEND = 'simple:///'
+
+Dummy caching (for development)
+-------------------------------
+
+Finally, Django comes with a "dummy" cache that doesn't actually cache -- it
+just implements the cache interface without doing anything.
+
+This is useful if you have a production site that uses heavy-duty caching in
+various places but a development/test environment on which you don't want to
+cache. In that case, set ``CACHE_BACKEND`` to ``"dummy:///"`` in the settings
+file for your development environment. As a result, your development
+environment won't use caching and your production environment still will.
+
+CACHE_BACKEND arguments
+-----------------------
+
+All caches may take arguments. They're given in query-string style on the
+``CACHE_BACKEND`` setting. Valid arguments are:
+
+ timeout
+ Default timeout, in seconds, to use for the cache. Defaults to 5
+ minutes (300 seconds).
+
+ max_entries
+ For the simple and database backends, the maximum number of entries
+ allowed in the cache before it is cleaned. Defaults to 300.
+
+ cull_percentage
+ The percentage of entries that are culled when max_entries is reached.
+ The actual percentage is 1/cull_percentage, so set cull_percentage=3 to
+ cull 1/3 of the entries when max_entries is reached.
+
+ A value of 0 for cull_percentage means that the entire cache will be
+ dumped when max_entries is reached. This makes culling *much* faster
+ at the expense of more cache misses.
+
+In this example, ``timeout`` is set to ``60``::
+
+ CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=60"
+
+In this example, ``timeout`` is ``30`` and ``max_entries`` is ``400``::
+
+ CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=30&max_entries=400"
+
+Invalid arguments are silently ignored, as are invalid values of known
+arguments.
+
+The per-site cache
+==================
+
+Once the cache is set up, the simplest way to use caching is to cache your
+entire site. Just add ``'django.middleware.cache.CacheMiddleware'`` to your
+``MIDDLEWARE_CLASSES`` setting, as in this example::
+
+ MIDDLEWARE_CLASSES = (
+ 'django.middleware.cache.CacheMiddleware',
+ 'django.middleware.common.CommonMiddleware',
+ )
+
+(The order of ``MIDDLEWARE_CLASSES`` matters. See "Order of MIDDLEWARE_CLASSES"
+below.)
+
+Then, add the following required settings to your Django settings file:
+
+* ``CACHE_MIDDLEWARE_SECONDS`` -- The number of seconds each page should be
+ cached.
+* ``CACHE_MIDDLEWARE_KEY_PREFIX`` -- If the cache is shared across multiple
+ sites using the same Django installation, set this to the name of the site,
+ or some other string that is unique to this Django instance, to prevent key
+ collisions. Use an empty string if you don't care.
+
+The cache middleware caches every page that doesn't have GET or POST
+parameters. Optionally, if the ``CACHE_MIDDLEWARE_ANONYMOUS_ONLY`` setting is
+``True``, only anonymous requests (i.e., not those made by a logged-in user)
+will be cached. This is a simple and effective way of disabling caching for any
+user-specific pages (include Django's admin interface). Note that if you use
+``CACHE_MIDDLEWARE_ANONYMOUS_ONLY``, you should make sure you've activated
+``AuthenticationMiddleware`` and that ``AuthenticationMiddleware`` appears
+before ``CacheMiddleware`` in your ``MIDDLEWARE_CLASSES``.
+
+Additionally, ``CacheMiddleware`` automatically sets a few headers in each
+``HttpResponse``:
+
+* Sets the ``Last-Modified`` header to the current date/time when a fresh
+ (uncached) version of the page is requested.
+* Sets the ``Expires`` header to the current date/time plus the defined
+ ``CACHE_MIDDLEWARE_SECONDS``.
+* Sets the ``Cache-Control`` header to give a max age for the page -- again,
+ from the ``CACHE_MIDDLEWARE_SECONDS`` setting.
+
+See the `middleware documentation`_ for more on middleware.
+
+.. _`middleware documentation`: ../middleware/
+
+The per-view cache
+==================
+
+A more granular way to use the caching framework is by caching the output of
+individual views. ``django.views.decorators.cache`` defines a ``cache_page``
+decorator that will automatically cache the view's response for you. It's easy
+to use::
+
+ from django.views.decorators.cache import cache_page
+
+ def slashdot_this(request):
+ ...
+
+ slashdot_this = cache_page(slashdot_this, 60 * 15)
+
+Or, using Python 2.4's decorator syntax::
+
+ @cache_page(60 * 15)
+ def slashdot_this(request):
+ ...
+
+``cache_page`` takes a single argument: the cache timeout, in seconds. In the
+above example, the result of the ``slashdot_this()`` view will be cached for 15
+minutes.
+
+The low-level cache API
+=======================
+
+Sometimes, however, caching an entire rendered page doesn't gain you very much.
+For example, you may find it's only necessary to cache the result of an
+intensive database query. In cases like this, you can use the low-level cache
+API to store objects in the cache with any level of granularity you like.
+
+The cache API is simple. The cache module, ``django.core.cache``, exports a
+``cache`` object that's automatically created from the ``CACHE_BACKEND``
+setting::
+
+ >>> from django.core.cache import cache
+
+The basic interface is ``set(key, value, timeout_seconds)`` and ``get(key)``::
+
+ >>> cache.set('my_key', 'hello, world!', 30)
+ >>> cache.get('my_key')
+ 'hello, world!'
+
+The ``timeout_seconds`` argument is optional and defaults to the ``timeout``
+argument in the ``CACHE_BACKEND`` setting (explained above).
+
+If the object doesn't exist in the cache, ``cache.get()`` returns ``None``::
+
+ >>> cache.get('some_other_key')
+ None
+
+ # Wait 30 seconds for 'my_key' to expire...
+
+ >>> cache.get('my_key')
+ None
+
+get() can take a ``default`` argument::
+
+ >>> cache.get('my_key', 'has expired')
+ 'has expired'
+
+There's also a get_many() interface that only hits the cache once. get_many()
+returns a dictionary with all the keys you asked for that actually exist in the
+cache (and haven't expired)::
+
+ >>> cache.set('a', 1)
+ >>> cache.set('b', 2)
+ >>> cache.set('c', 3)
+ >>> cache.get_many(['a', 'b', 'c'])
+ {'a': 1, 'b': 2, 'c': 3}
+
+Finally, you can delete keys explicitly with ``delete()``. This is an easy way
+of clearing the cache for a particular object::
+
+ >>> cache.delete('a')
+
+That's it. The cache has very few restrictions: You can cache any object that
+can be pickled safely, although keys must be strings.
+
+Upstream caches
+===============
+
+So far, this document has focused on caching your *own* data. But another type
+of caching is relevant to Web development, too: caching performed by "upstream"
+caches. These are systems that cache pages for users even before the request
+reaches your Web site.
+
+Here are a few examples of upstream caches:
+
+ * Your ISP may cache certain pages, so if you requested a page from
+ somedomain.com, your ISP would send you the page without having to access
+ somedomain.com directly.
+
+ * Your Django Web site may sit behind a Squid Web proxy
+ (http://www.squid-cache.org/) that caches pages for performance. In this
+ case, each request first would be handled by Squid, and it'd only be
+ passed to your application if needed.
+
+ * Your Web browser caches pages, too. If a Web page sends out the right
+ headers, your browser will use the local (cached) copy for subsequent
+ requests to that page.
+
+Upstream caching is a nice efficiency boost, but there's a danger to it:
+Many Web pages' contents differ based on authentication and a host of other
+variables, and cache systems that blindly save pages based purely on URLs could
+expose incorrect or sensitive data to subsequent visitors to those pages.
+
+For example, say you operate a Web e-mail system, and the contents of the
+"inbox" page obviously depend on which user is logged in. If an ISP blindly
+cached your site, then the first user who logged in through that ISP would have
+his user-specific inbox page cached for subsequent visitors to the site. That's
+not cool.
+
+Fortunately, HTTP provides a solution to this problem: A set of HTTP headers
+exist to instruct caching mechanisms to differ their cache contents depending
+on designated variables, and to tell caching mechanisms not to cache particular
+pages.
+
+Using Vary headers
+==================
+
+One of these headers is ``Vary``. It defines which request headers a cache
+mechanism should take into account when building its cache key. For example, if
+the contents of a Web page depend on a user's language preference, the page is
+said to "vary on language."
+
+By default, Django's cache system creates its cache keys using the requested
+path -- e.g., ``"/stories/2005/jun/23/bank_robbed/"``. This means every request
+to that URL will use the same cached version, regardless of user-agent
+differences such as cookies or language preferences.
+
+That's where ``Vary`` comes in.
+
+If your Django-powered page outputs different content based on some difference
+in request headers -- such as a cookie, or language, or user-agent -- you'll
+need to use the ``Vary`` header to tell caching mechanisms that the page output
+depends on those things.
+
+To do this in Django, use the convenient ``vary_on_headers`` view decorator,
+like so::
+
+ from django.views.decorators.vary import vary_on_headers
+
+ # Python 2.3 syntax.
+ def my_view(request):
+ ...
+ my_view = vary_on_headers(my_view, 'User-Agent')
+
+ # Python 2.4 decorator syntax.
+ @vary_on_headers('User-Agent')
+ def my_view(request):
+ ...
+
+In this case, a caching mechanism (such as Django's own cache middleware) will
+cache a separate version of the page for each unique user-agent.
+
+The advantage to using the ``vary_on_headers`` decorator rather than manually
+setting the ``Vary`` header (using something like
+``response['Vary'] = 'user-agent'``) is that the decorator adds to the ``Vary``
+header (which may already exist) rather than setting it from scratch.
+
+You can pass multiple headers to ``vary_on_headers()``::
+
+ @vary_on_headers('User-Agent', 'Cookie')
+ def my_view(request):
+ ...
+
+Because varying on cookie is such a common case, there's a ``vary_on_cookie``
+decorator. These two views are equivalent::
+
+ @vary_on_cookie
+ def my_view(request):
+ ...
+
+ @vary_on_headers('Cookie')
+ def my_view(request):
+ ...
+
+Also note that the headers you pass to ``vary_on_headers`` are not case
+sensitive. ``"User-Agent"`` is the same thing as ``"user-agent"``.
+
+You can also use a helper function, ``django.utils.cache.patch_vary_headers``,
+directly::
+
+ from django.utils.cache import patch_vary_headers
+ def my_view(request):
+ ...
+ response = render_to_response('template_name', context)
+ patch_vary_headers(response, ['Cookie'])
+ return response
+
+``patch_vary_headers`` takes an ``HttpResponse`` instance as its first argument
+and a list/tuple of header names as its second argument.
+
+For more on Vary headers, see the `official Vary spec`_.
+
+.. _`official Vary spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
+
+Controlling cache: Using other headers
+======================================
+
+Another problem with caching is the privacy of data and the question of where
+data should be stored in a cascade of caches.
+
+A user usually faces two kinds of caches: his own browser cache (a private
+cache) and his provider's cache (a public cache). A public cache is used by
+multiple users and controlled by someone else. This poses problems with
+sensitive data: You don't want, say, your banking-account number stored in a
+public cache. So Web applications need a way to tell caches which data is
+private and which is public.
+
+The solution is to indicate a page's cache should be "private." To do this in
+Django, use the ``cache_control`` view decorator. Example::
+
+ from django.views.decorators.cache import cache_control
+ @cache_control(private=True)
+ def my_view(request):
+ ...
+
+This decorator takes care of sending out the appropriate HTTP header behind the
+scenes.
+
+There are a few other ways to control cache parameters. For example, HTTP
+allows applications to do the following:
+
+ * Define the maximum time a page should be cached.
+ * Specify whether a cache should always check for newer versions, only
+ delivering the cached content when there are no changes. (Some caches
+ might deliver cached content even if the server page changed -- simply
+ because the cache copy isn't yet expired.)
+
+In Django, use the ``cache_control`` view decorator to specify these cache
+parameters. In this example, ``cache_control`` tells caches to revalidate the
+cache on every access and to store cached versions for, at most, 3600 seconds::
+
+ from django.views.decorators.cache import cache_control
+ @cache_control(must_revalidate=True, max_age=3600)
+ def my_view(request):
+ ...
+
+Any valid ``Cache-Control`` HTTP directive is valid in ``cache_control()``.
+Here's a full list:
+
+ * ``public=True``
+ * ``private=True``
+ * ``no_cache=True``
+ * ``no_transform=True``
+ * ``must_revalidate=True``
+ * ``proxy_revalidate=True``
+ * ``max_age=num_seconds``
+ * ``s_maxage=num_seconds``
+
+For explanation of Cache-Control HTTP directives, see the `Cache-Control spec`_.
+
+(Note that the caching middleware already sets the cache header's max-age with
+the value of the ``CACHE_MIDDLEWARE_SETTINGS`` setting. If you use a custom
+``max_age`` in a ``cache_control`` decorator, the decorator will take
+precedence, and the header values will be merged correctly.)
+
+.. _`Cache-Control spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
+
+Other optimizations
+===================
+
+Django comes with a few other pieces of middleware that can help optimize your
+apps' performance:
+
+ * ``django.middleware.http.ConditionalGetMiddleware`` adds support for
+ conditional GET. This makes use of ``ETag`` and ``Last-Modified``
+ headers.
+
+ * ``django.middleware.gzip.GZipMiddleware`` compresses content for browsers
+ that understand gzip compression (all modern browsers).
+
+Order of MIDDLEWARE_CLASSES
+===========================
+
+If you use ``CacheMiddleware``, it's important to put it in the right place
+within the ``MIDDLEWARE_CLASSES`` setting, because the cache middleware needs
+to know which headers by which to vary the cache storage. Middleware always
+adds something the ``Vary`` response header when it can.
+
+Put the ``CacheMiddleware`` after any middlewares that might add something to
+the ``Vary`` header. The following middlewares do so:
+
+ * ``SessionMiddleware`` adds ``Cookie``
+ * ``GZipMiddleware`` adds ``Accept-Encoding``