Technology


A slug is an SEO-friendly, human-readable version of a URL. Most blog software generates slugs from a post’s title for permalinks (exactly like my blog here), but you can slug basically any string you want to turn into a friendly URL. Sure, you could just use PHP’s urlencode (or another language’s equivalent), but then you’re stuck with unfriendly characters translated into hex codes: %2F%20

The problem gets worse when the content you want to slug is UTF-8 encoded and contains non-ASCII characters. How do you slug a word like: Iñtërnâtiônàlizætiøn?

My ongoing redo of Footstops, which now creates slugged URLs from UTF-8 user-generated content, has taken me into exactly this territory, so I’ll share my slug method with you. The one caveat is that its power relies on the awesome iconv library, which has been enabled by default since PHP 5.0.0 and is easily installable in PHP 4.2+, so make sure you have it. If you don’t, remove the iconv line; the method still works, just not nearly as well. I also assume that your data is encoded in UTF-8, which is a fairly safe bet because UTF-8 is backward compatible with ASCII, but if you are working in a different charset, please adjust as necessary.

The method is short and sweet.

1. First we use iconv to transliterate the UTF-8 string into ASCII. Besides converting the encoding, this translates non-ASCII characters into their closest ASCII-looking equivalents: ë becomes e.

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $string);

2. We then remove all the unwanted characters from the URL.

preg_replace('/[^a-zA-Z0-9 -]/', '', $url);

3. We convert it to lowercase (just a preference, for consistency), cut it down to our maximum length (we don’t want a 64-character slug; 40-50 characters is probably plenty), and remove any surrounding whitespace.

trim(substr(strtolower($url), 0, $maxLength));

4. Finally, we replace any run of whitespace or separator characters with a single instance of the separator. I prefer an underscore as a word separator rather than a dash (the traditional slug separator), as a dash may conflict with an actual hyphen in the string, but in the final version you’ll see it’s easy to default to your own preference.

preg_replace('/[\s' . preg_quote($separator, '/') . ']+/', $separator, $url);

So we put it all together with a couple of options:

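public static function ToSlug($string, $maxLength=40, $separator='_') {
	// Transliterate UTF-8 into the closest ASCII equivalents (needs iconv)
	$url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $string);
	// Strip everything except letters, digits, spaces and hyphens
	$url = preg_replace('/[^a-zA-Z0-9 -]/', '', $url);
	// Lowercase, cut to the maximum length, trim surrounding whitespace
	$url = trim(substr(strtolower($url), 0, $maxLength));
	// Collapse runs of whitespace/separators into a single separator
	$url = preg_replace('/[\s' . preg_quote($separator, '/') . ']+/', $separator, $url);
	return $url;
}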

Calling ToSlug('Iñtërnâtiônàlizætiøn    is the greatest! ') produces the slug:

internationalizaetion_is_the_greatest

perfect for a friendly URL!
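
For comparison, here is a rough Python sketch of the same four steps. It is not a port of the iconv approach: Python has no direct equivalent of iconv's //TRANSLIT in its standard library, so this version swaps in Unicode NFKD normalization, which only strips accents. The function name to_slug and its parameters are my own for this illustration. Letters that have no accent-free decomposition, such as æ and ø, are simply dropped instead of being transliterated.

```python
import re
import unicodedata


def to_slug(text, max_length=40, separator="_"):
    """Illustrative slug helper; uses NFKD accent stripping, not iconv."""
    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with errors="ignore" then drops the marks
    # (and any character that has no ASCII base, e.g. "æ" or "ø").
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove everything except letters, digits, spaces and hyphens.
    cleaned = re.sub(r"[^a-zA-Z0-9 -]", "", ascii_text)
    # Lowercase, cut to the maximum length, trim surrounding whitespace.
    cleaned = cleaned.lower()[:max_length].strip()
    # Collapse runs of whitespace/separators into one separator.
    return re.sub(r"[\s" + re.escape(separator) + r"]+", separator, cleaned)


print(to_slug("Crème brûlée   recipe! "))  # creme_brulee_recipe
print(to_slug("Iñtërnâtiônàlizætiøn"))     # internationaliztin
```

The second call shows why iconv is worth having: with transliteration the æ becomes "ae", while plain accent stripping loses it entirely.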

I was perusing the Ubuntu Hardy Heron backports repository (what? You don’t do that?!) and noticed that Subversion had been added to the backports. If you’re reading this, you probably already know that Hardy officially sticks you with Subversion 1.4, and unless you wanted to compile from source or install a third-party .deb, you were pretty much out of luck.

Now that it’s been added to the backports repository, it’s very simple to upgrade.

If you’ve never added anything from backports before, listen up. Backports are packages with feature upgrades (not security fixes) that won’t show up in the normal apt-get upgrade repositories. You could simply add the following line to your sources.list:

deb http://archive.ubuntu.com/ubuntu hardy-backports main universe multiverse restricted

On a default Hardy install it should already be in there, just commented out. So you can uncomment the two backports lines.

Doing only this will upgrade every package on your machine to the newest backported version, which is probably not what you want, as you won’t get security updates for the backported versions. Instead we can use something called pinning, which is available on most apt-based operating systems.

What we do is add the repository to our sources.list as above but set its priority lower than the default, so that the normal repositories take precedence. Then we can, on a per-package basis, upgrade only the packages we expressly want from backports.

In the file /etc/apt/preferences (you may have to create it), add the following lines:

Package: *
Pin: release a=hardy-backports
Pin-Priority: 400

We’re giving the hardy-backports repository a lower-than-default priority (the default is 500), so it won’t override the official repositories.

Update your apt cache: sudo apt-get update

Now if you do an apt-get install subversion you shouldn’t see any changes… so how do we install from backports?

sudo apt-get install -t hardy-backports subversion

As you can see, it’s the same line, except we set the target release of the install to the backports repository.
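
If the priorities feel abstract, here is a toy Python model of the idea. It is only a sketch, not apt's actual algorithm: it assumes that the highest priority wins, that unpinned repositories get apt's default of 500, and that naming a release with -t raises it to 990 (my reading of the apt_preferences man page). The version numbers and the pick_candidate function are made up for the illustration.

```python
DEFAULT_PRIORITY = 500  # apt's default for ordinary repositories
TARGET_PRIORITY = 990   # assumed priority of the release named with -t


def pick_candidate(versions, pins, target_release=None):
    """Pick the (version, release) pair this toy model would install.

    versions: list of (version_tuple, release_name) pairs
    pins:     dict mapping release_name -> pinned priority
    """
    def priority(release):
        if release == target_release:
            return TARGET_PRIORITY
        return pins.get(release, DEFAULT_PRIORITY)

    # Highest priority wins; the version number only breaks ties.
    return max(versions, key=lambda v: (priority(v[1]), v[0]))


# Hypothetical version numbers, for illustration only.
versions = [((1, 4, 6), "hardy"), ((1, 5, 1), "hardy-backports")]
pins = {"hardy-backports": 400}

print(pick_candidate(versions, pins))
print(pick_candidate(versions, pins, target_release="hardy-backports"))
```

Without a target, the pinned-down backports version loses to the official one even though it is newer; with the target set, it wins.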

Now you can keep your servers on the main repositories but still update a library here and there if you really desire new functionality.

I was doing some system maintenance today and came across the following horrific screen:

/dev/md0:
        Version : 00.90.03
  Creation Time : Sun Nov 16 14:13:20 2007
     Raid Level : raid5
     Array Size : 732587712 (698.65 GiB 750.17 GB)
  Used Dev Size : 244195904 (232.88 GiB 250.06 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Dec 31 10:41:15 2008
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0

...

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       8        1        -      faulty spare

One of the drives in my fileserver had died! Time to back up and get that sucker running again.

Please note: The following is only a guide to help you replace a failed disc. I cannot guarantee this will work for you, but it is what I do and has worked every time without any data loss.

As you can see, it is a 4 disc software RAID 5 array with no hotswap spares. The following should work for most single disc failure situations in RAID 1, 5, or 6.
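
As a quick sanity check on the numbers in that output, here is a small Python sketch of the RAID 5 arithmetic: one device's worth of space goes to parity, so the usable size is (devices - 1) times the per-device size. It assumes the sizes mdadm prints are in KiB; the helper names are my own.

```python
def raid5_usable_kib(devices, per_device_kib):
    # RAID 5 spends one device's worth of space on parity.
    return (devices - 1) * per_device_kib


def kib_to_gib(kib):
    # Binary units: 1 GiB = 1024**3 bytes.
    return kib * 1024 / 1024**3


def kib_to_gb(kib):
    # Decimal units: 1 GB = 1000**3 bytes.
    return kib * 1024 / 1000**3


array_kib = raid5_usable_kib(4, 244195904)
print(array_kib)                                                     # 732587712
print(round(kib_to_gib(array_kib), 2), round(kib_to_gb(array_kib), 2))  # 698.65 750.17
```

Both results match the Array Size line above, so the array really is four 250 GB members with three members' worth of usable space.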

It appears that sda has bailed on me. First things first: back up the machine. If anything goes wrong, you can rebuild from scratch.

You can see the faulty disc has already been removed from the array, but if yours hasn’t been removed yet, the commands:

mdadm --manage /dev/md0 -f /dev/sda1
mdadm --manage /dev/md0 -r /dev/sda1

will mark the sda1 partition as failed (so it can be removed) and then remove it from the array.

Shut down the machine and swap out the hard drive. Make sure you only replace the faulty drive, and don’t mess up the order of the drives, because it’ll be a pain to get back right.

Boot up the machine. Your RAID array will be in the same degraded state. We need to partition the new drive exactly the same way we partitioned the drives in the existing array. Luckily this is a one-liner with sfdisk:

sfdisk -d /dev/sdb | sfdisk /dev/sda

The above command dumps the partition table of sdb (any of the functioning drives will do) and pipes it to sfdisk to partition sda the same way. It should only take a second.

Then we can simply add the new drive to the array:

mdadm --manage /dev/md0 -a /dev/sda1

Bam! If you take a look at cat /proc/mdstat or mdadm --detail /dev/md0 you should see that the array is recovering (with a percentage done).

After the recovery is done, you’ll be back to new and clean!

Good luck!

Wow! We’ve recently moved to Visual Studio 2008 at work and I’ve been moving one of our larger products to resource files and multi-language support. In doing so, you have to look at the Design view (or Split View) in the IDE to be able to generate a local resource file.

Every time I go to Design View and generate the file, all my beautifully formatted markup is destroyed by Visual Studio. This has happened ever since I started using Visual Studio, and I had succumbed to the inevitability of formatting failure…

Until today…

Fed up with the formatting, after fruitless online searching, I found the answer myself very quickly and I’ve been kicking myself since.

  1. Head to the top menu and select Tools
  2. Choose Options
  3. Expand Text Editor
  4. Expand HTML
  5. Choose Format
  6. Uncheck Wrap tags when exceeding ...

Easy! You might want to stay in that screen and fix any capitalization or auto insert close tag issues you may have.

I’ve finally got around to finishing our new development project space. You can find the open source portion at: http://projects.brybot.ca or by using the Software link from the top menu.

The development space is a major step forward in what’s happening around Brybot as well as the Alkaloid Networks. We’ve set up brand new open source and closed source repositories, ticket trackers, and wikis to make all our future projects easier to use, manage, and maintain… we’re all very excited.

As everything is moved over to our new system, expect lots of updates on some of our upcoming projects and new websites we’re launching.

I’ve been using MAMP on my MacBook for doing my web development at home and can’t say enough good things about it. Unfortunately I had to start using some binaries not included with MAMP, ImageMagick for example.

Update: Commenters have been reporting that this fix does not work on Snow Leopard or MAMP 1.8.4.

Installing ImageMagick via MacPorts is simple enough:

sudo port install ImageMagick

but trying to run it via the MAMP stack gives you the Apache error:

dyld: Symbol not found: __cg_jpeg_resync_to_restart
  Referenced from: /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/ImageIO.framework/Versions/A/ImageIO
  Expected in: /Applications/MAMP/Library/lib/libjpeg.62.dylib

or something similar. I was able to get it to go, but you need to break the sandbox, so it may not be a viable solution for everyone.

Once ImageMagick is installed, head to:

/Applications/MAMP/Library/bin/envvars

Here we want to comment out the two lines:

#DYLD_LIBRARY_PATH="/Applications/MAMP/Library/lib:$DYLD_LIBRARY_PATH"
#export DYLD_LIBRARY_PATH

and add the line:

export PATH="$PATH:/opt/local/bin"

What we’re doing is using our system’s default DYLD library path, not the sandboxed one, and adding our MacPorts bin directory to the sandbox PATH, so it can execute your ImageMagick binaries like convert without specifying the full path to them.

Presto! I’m not sure of the full consequences of doing this, but I haven’t run into any problems yet and nothing seems to break as a result. Always back up the original files tho!

I’ve had tons of people ask me how to select DISTINCT rows from a DataSet. Why not use SQL? Well, sometimes you just can’t, and sometimes it’s much more efficient to do it on the web server side than on the database side.

For some reason Microsoft has an article on how to write your own helper class, which is fine and dandy, but you don’t need to bother. While this implementation isn’t exactly the same as a SQL call, it’s very simple and works extremely well with small to medium datasets.

Say we have a DataSet ds like so:

recordID groupID value
1 100 abc
2 100 def
3 220 ghi
4 333 jkl

So to select distinct groupIDs we simply call:

DataTable distinctDT = ds.Tables[0].DefaultView.ToTable(true, new string [] { "groupID" });

This returns:

groupID
100
220
333

We can now use this distinct DataTable to make our queries on our original DataSet:

ds.Tables[0].Select("groupID = " + distinctDT.Rows[0]["groupID"].ToString());

That looks a little more complicated than it actually is… oh well, it’s actually quite a nice solution when you can’t use SQL to do your DISTINCT selects for you, because you can reuse the same DataSet over and over, saving round trips to your SQL Server or file I/O.
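
If the two-step idea is easier to see outside of ADO.NET, here is a minimal Python sketch of it using the sample rows from above. It is an illustration of the technique only, not of how DataView.ToTable is implemented; the distinct and select helpers are names I made up.

```python
rows = [
    {"recordID": 1, "groupID": 100, "value": "abc"},
    {"recordID": 2, "groupID": 100, "value": "def"},
    {"recordID": 3, "groupID": 220, "value": "ghi"},
    {"recordID": 4, "groupID": 333, "value": "jkl"},
]


def distinct(rows, columns):
    """Project rows onto the given columns, keeping each combination once."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def select(rows, column, wanted):
    """Return the original rows whose column equals the wanted value."""
    return [row for row in rows if row[column] == wanted]


groups = distinct(rows, ["groupID"])
print([g["groupID"] for g in groups])  # [100, 220, 333]

first_group = select(rows, "groupID", groups[0]["groupID"])
print([r["recordID"] for r in first_group])  # [1, 2]
```

The data is loaded once, and both the distinct list and the per-group lookups run against that same in-memory copy.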

I often have the need to get the decimal degrees of a longitude and latitude and generally use Google Maps (and Earth) to find the location, but there is no easy way to get the decimal degrees for a coordinate – you only get Degrees/Minutes/Seconds.

I was messing around with Google Maps and the Mapping API the other day and found a simple way to get decimal degrees (without parsing them out of the “Link Here” url).

Simply centre the Google Map on the spot you’re interested in (right-click -> Centre Map here) and enter this in the browser’s address bar:

javascript:void(prompt("", gApplication.getMap().getCenter()));
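
If all you have is a Degrees/Minutes/Seconds reading, the conversion itself is simple arithmetic: degrees + minutes/60 + seconds/3600, negated for south and west. Here is a minimal Python sketch of that formula; the function name and the sample coordinate are my own and purely illustrative.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert Degrees/Minutes/Seconds to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative.
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return decimal


# Made-up sample coordinate: 49° 16' 58.8" N, 123° 7' 12" W
print(round(dms_to_decimal(49, 16, 58.8, "N"), 6))
print(round(dms_to_decimal(123, 7, 12, "W"), 6))
```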

Nexenta

My personal servers have been running Nexenta NCP, a hybrid OS running the Solaris kernel (and thus ZFS) and the GNU userland (think Ubuntu), for a couple months now. As of this weekend/week, we’ll be moving the Alkaloid Canada infrastructure over to Nexenta as well, basically for ZFS. I believe the eventual plan is to move Alkaloid US in Atlanta over, but that may take a lot longer. Us Canadians are always the guinea pigs…

So if things stop working in the next week, it’ll be because of growing pains. Wish us luck!

StopMiningTibet.com petition is available.

I finished putting up a simple moratorium petition for Students for a Free Tibet’s Stop Mining Tibet campaign.

Mining corporations around the globe, including in Canada, are contracting with China to extract resources from Tibet’s rich land, with little compensation going to the Tibetan people. Sign the petition to help put a stop to this great injustice!

“There shall be no mining in Tibet until the Tibetan people can, for their own ends, and with full free, prior and informed consent, control and make decisions about the extraction and disposal of their mineral wealth using the highest standards of participatory governance and ecological management.”

http://www.stopminingtibet.com/moratorium.php

Be sure to visit the site!