Python Case Insensitive Glob

By Tyler on

On its own, Python 2’s glob module has limited pattern matching, but combining it with a regex makes it more useful. Surprisingly, I couldn’t find anything concise and practical for selecting files by extension case-insensitively, so I wrote this:

#!/usr/bin/env python

import glob, re

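# Keep only names ending in .zip, ignoring case (.zip, .ZIP, .Zip, ...).
# The raw string keeps \. as a regex escape; `name` avoids shadowing the `file` builtin.
for name in [f for f in glob.glob('*') if re.match(r'^.*\.zip$', f, flags=re.IGNORECASE)]:
    print(name)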
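If you need this in more than one place, the same filter can be wrapped in a small helper. This is only a minimal sketch: `glob_ext` is a name I made up (it is not part of the glob module), and it assumes the directory path itself contains no glob metacharacters. It should run on both Python 2 and 3.

```python
import glob
import os
import re


def glob_ext(directory, ext):
    """Return paths in `directory` whose names end in `.ext`, ignoring case.

    `glob_ext` is a hypothetical helper name, not part of the glob module.
    """
    # re.escape guards against regex metacharacters in the extension.
    pattern = re.compile(r'.*\.' + re.escape(ext) + r'$', re.IGNORECASE)
    candidates = glob.glob(os.path.join(directory, '*'))
    # Match against the file name only, so directory names cannot interfere.
    return sorted(p for p in candidates if pattern.match(os.path.basename(p)))
```

Calling `glob_ext('.', 'zip')` should give the same set of files as the loop above, with `./` in front of each name.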

Running Out of Memory With Too Many Apache VirtualHosts

By Tyler on

I’m running a VPS with 1GB of RAM that hosts around 20 low-traffic sites (LAMP stack) and I was having a problem where requests started taking longer and longer and MySQL would sporadically crash. I upgraded PHP and Apache, tweaked some MySQL settings, and installed monit to keep the MySQL process alive but it turned out these changes were treating the symptoms and not the cause. It was inexplicable because in the beginning every website was loading quickly and I hadn’t made any significant changes since the VPS was created.

Later, taking a look at the memory usage, I discovered it was always at or near 100% and spilling into swap. When the VPS got a burst of traffic, I guess the swap was eaten up too and the MySQL process got killed. That explained one piece of the puzzle, but I still didn’t understand what was taking up so much memory in the first place. One of the things I tried was looking at the Apache access and error logs to see if there were any clues there. I didn’t see anything out of the ordinary, so I cleared the logs out to start fresh. Memory usage immediately dropped to 50% and the sites were loading instantly again. WTF?

Turns out other_vhosts_access.log was the bottleneck and clearing that solved the problem. Going forward I’ll either be rotating logs or disabling that particular log.
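To keep an eye on this in future, something like the following could flag oversized log files before they become a problem. It is a rough sketch rather than what I actually ran: `largest_files` is a made-up helper name, and the log directory you pass in (for example `/var/log/apache2`, the usual Debian/Ubuntu location) is an assumption that depends on your distribution.

```python
import os


def largest_files(directory, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first.

    Hypothetical helper; only regular files directly inside
    `directory` are considered, subdirectories are skipped.
    """
    sized = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            sized.append((os.path.getsize(path), path))
    # Tuples sort by size first, so reverse=True puts the biggest file on top.
    return sorted(sized, reverse=True)[:top]
```

Running `largest_files('/var/log/apache2')` as a user with read access to that directory should put a runaway other_vhosts_access.log at the top of the list.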

Postgres Immutable Concat in Indexes

By Tyler on

The volatility category of the Postgres concat function is STABLE, which means it’s only guaranteed to return the same results given the same arguments for all rows within a single statement. Index expressions may only use IMMUTABLE functions, so basically you’re out of luck if you want to use concat in an index. Tom Lane explains in this post:

> The pg_catalog.concat() is defined as STABLE function.
> why was STABLE preferred for concat() over IMMUTABLE?

concat() invokes datatype output functions, which are not necessarily
immutable.  An easy example is that timestamptz_out's results depend
on the TimeZone setting.
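To make that concrete, here is a loose analogy in Python 3 (not Postgres code): the same stored instant renders as different text depending on the zone in effect, which is the kind of session-dependent output that keeps concat() from being IMMUTABLE. The `render` helper and the fixed UTC offsets are my own illustration, standing in for timestamptz_out and the TimeZone setting.

```python
from datetime import datetime, timedelta, timezone

# One fixed instant, stored unambiguously in UTC.
instant = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)


def render(moment, offset_hours):
    """Format `moment` as text in a fixed-offset zone.

    Hypothetical stand-in for a datatype output function whose
    result depends on a session setting such as TimeZone.
    """
    zone = timezone(timedelta(hours=offset_hours))
    return moment.astimezone(zone).strftime('%Y-%m-%d %H:%M:%S%z')


utc_text = render(instant, 0)    # '2020-01-01 12:00:00+0000'
tokyo_text = render(instant, 9)  # '2020-01-01 21:00:00+0900'
```

Same input value, two different strings. An index built on text like this would silently go stale whenever the setting changed.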

In my case I’m only dealing with integer and text columns so I modified the original function (STABLE -> IMMUTABLE), creating immutable_concat.

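-- Same internal implementation as concat() (text_concat), declared IMMUTABLE.
-- Only safe when every argument type has an immutable output function (e.g. integer, text).
CREATE OR REPLACE FUNCTION immutable_concat(VARIADIC "any")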
  RETURNS text AS
'text_concat'
  LANGUAGE internal IMMUTABLE
  COST 1;