5.27.2009

Working with RPMs on CentOS

I'm pretty much going to just start spitting out all my favorite rpm commands here. I'm assuming CentOS, but most of this is applicable to any rpm-based distro.

List every rpm installed on your system, pipe it through grep to find something specific
rpm -qa
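
For example, to narrow that down to just the php-related packages:

rpm -qa | grep -i php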

Info about a specific rpm: version number, build host, and other useful info
[root@webapp01 ~]# rpm -qi php
Name        : php                           Relocations: (not relocatable)
Version     : 5.2.2                              Vendor: (none)
Release     : 3                              Build Date: Thu 13 Mar 2008 04:31:00 PM PDT
Install Date: Fri 14 Mar 2008 11:02:12 AM PDT Build Host: dev.myhost.com
Group       : Development/Languages         Source RPM: php-5.2.2-3.src.rpm
Size        : 12049373                          License: The PHP License v3.01
Signature   : (none)
URL         : http://www.php.net/
Summary     : The PHP HTML-embedded scripting language. (PHP: Hypertext Preprocessor)
Description :
PHP is an HTML-embedded scripting language. PHP attempts to make it
easy for developers to write dynamically generated webpages. PHP also
offers built-in database integration for several commercial and
non-commercial database management systems, so writing a
database-enabled webpage with PHP is fairly simple. The most common
use of PHP coding is probably as a replacement for CGI scripts.

The php package contains the module which adds support for the PHP


What rpm a specific file is from
rpm -qf /path/filename

List everything installed by an rpm
rpm -ql packagename

Turn any Perl CPAN module into an rpm and install it
cpan2rpm -i Getopt::Long

After you've installed a src rpm, change into /usr/src/redhat/SPECS/ and run this against your package's specfile; it takes the build through the %prep, %build, and %install stages. If you don't have a src rpm but you do have a specfile for a source tarball, put the specfile in /usr/src/redhat/SPECS/ and the tarball in /usr/src/redhat/SOURCES .
rpmbuild -bi packname.spec
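
If what you actually want is a binary rpm you can hand to the rpm command, use -bb (or -ba for source and binary); the package lands under /usr/src/redhat/RPMS/. The filename below is hypothetical, it depends on your spec's name, version, and arch:

rpmbuild -bb packname.spec
rpm -Uvh /usr/src/redhat/RPMS/x86_64/packname-1.0-1.x86_64.rpm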

Creates an rpm repo in your current directory from the rpms in that directory
createrepo .

Searches all your rpm repos for a specific file
yum provides '*/rpmbuild'

List/Install/Remove package groups. Package groups are groups of rpms that are generally needed together, e.g. 'Development Libraries' or 'KDE (K Desktop Environment)'
yum grouplist / yum groupinstall / yum groupremove
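
For example, using one of the groups mentioned above (quote group names that contain spaces):

yum groupinstall "Development Libraries"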

Fun with Find

Recently I needed to clean up a bunch of temp files that were being created by a web application but not being properly removed. I turned to find as a stopgap while the real problem was being solved. In my situation I knew that no file should survive longer than two hours in the directory where the files were being created, so using find I check the ctime on the files and remove them if they haven't changed in more than 120 minutes. Find's -cmin option with +120 gives us files that have not been changed in the last 120 minutes; a negative number (-120) would instead give us only files that have been changed in the last 120 minutes. Since I only want to delete files and keep directories, I also use the -type f option. If I simply run

shell> find /path/to/files -cmin +120 -type f

This will give me a list of all files below the path I've given find that have not been changed in the last 120 minutes. If I wanted to make sure find didn't list files in subdirectories, I would add the option -maxdepth 1, as shown below.
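
That depth-limited version, with the same placeholder path, looks like

shell> find /path/to/files -maxdepth 1 -cmin +120 -type f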

So now we have a list of the files we want to delete, and the -exec option will let us delete them. This is pretty much the same as piping the output of find to xargs, but it should use a little less overhead. With the -exec option, the command we supply is executed once for each thing find has found: find 25 items and your command is executed 25 times. Since we want the files we've found to be part of the command, we use '{}' as a placeholder. We also need to let find know where the command we are supplying ends. You use a ';' for that, but since ';' has its own meaning in bash, it needs to be escaped. So our option to remove all the files we've found without confirmation looks like

-exec rm -f '{}' \;

If we used '+' instead of ';', find would string many of the files we've found together and execute the command fewer times. This can be more efficient depending on the command being run.
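
For example, the batched form of the same cleanup would be

shell> find /path/to/files -cmin +120 -type f -exec rm -f '{}' +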

Now I needed to run this command every hour, so I created a little bash script to start the process up. I set its nice level high so its priority would be low, and added a little check to make sure we don't start the job again if the last run hasn't finished yet.

#!/bin/bash

# http://greg-techblog.blogspot.com
TMPPATH=/var/www/html/tmp/

# Check if cleantmp.cron is already running
if [ ! -f /var/run/cleantmp.pid ]; then
    # Delete all files in tmp that are older than two hours,
    # run at the lowest priority
    nice -n 19 find "$TMPPATH" -nowarn -cmin +120 -type f -exec rm -f '{}' \; &
    PID=$!                              # Get the pid of the find process
    echo "$PID" > /var/run/cleantmp.pid # create a pidfile
    wait $PID                           # wait for find to finish
    rm -f /var/run/cleantmp.pid         # remove pidfile
    exit
fi

# cleantmp is already running
echo cleantmp.cron already running
cat /var/run/cleantmp.pid # print pid of find process
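
Dropping the script into /etc/cron.hourly works, or you can give it its own crontab entry; the path below is just an assumption about where you saved it.

# /etc/crontab
0 * * * * root /usr/local/sbin/cleantmp.cron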

5.26.2009

Dealing with Memory Usage in Apache

Whatever amount of memory an Apache process uses to fulfill a request, that process will continue to hold onto that memory for the lifetime of the process. When you graceful Apache, you cause all your current processes to exit as soon as they finish serving the requests they have already received, and then Apache creates a new set of processes. This typically frees up large blocks of memory used to serve old requests. Apache also does this on its own, controlled by the MaxRequestsPerChild directive, which is the number of requests a process should serve during its lifetime. Making this number too low will cause Apache to tear down and create new processes quickly, increasing CPU overhead. Setting MaxRequestsPerChild too high will cause your machine to use more memory than it likely needs to, possibly causing your system to swap or even run out of memory completely. Apache has no other means of garbage collection.

The default config for Apache on CentOS 5.3 has MaxRequestsPerChild set at 4000. I've seen people suggest that if your application primarily serves dynamic content, a setting as low as 20 may make sense. I've opted for 800 to 1000; it's a number I'm still playing with. Setting it this low has reduced the overall amount of memory my systems use without significantly increasing CPU load. It's something you'll need to experiment with.
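
For reference, the directive lives in the prefork MPM block of httpd.conf. The values below are the stock CentOS 5 ones except for MaxRequestsPerChild, which I've dropped to 1000; treat them as a starting point, not gospel.

<IfModule prefork.c>
StartServers         8
MinSpareServers      5
MaxSpareServers     20
ServerLimit        256
MaxClients         256
MaxRequestsPerChild 1000
</IfModule>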

5.22.2009

Centralize Syslogging Quick and Dirty

Let's start with this: it shouldn't be done across a public network. If you need to do that, you should do basically the same thing as all of this, except through stunnel, but I'm not going to get into how to set all that up right now. Basically what we are going to do is set up one machine to accept syslog messages from anywhere (remember, private network here; it would be a security risk to let just anyone send you syslog messages). Then we configure another machine to send its syslog messages to the first machine. Last, we'll set up cron jobs on both machines to make sure we're still receiving messages from the remote server.

I'm using CentOS/Redhat for this, so this setup will differ a little on different distros, but not much.

Start with the machine we want to receive syslog messages on. We'll call this machine foo and the machine we want to send messages from bar. Inventive... I know.

On foo we open up /etc/sysconfig/syslog and add the -r option

[root@foo ~]# vim /etc/sysconfig/syslog

# Options to syslogd
# -m 0 disables 'MARK' messages.
# -r enables logging from remote machines
# -x disables DNS lookups on messages recieved with -r
# See syslogd(8) for more details
SYSLOGD_OPTIONS="-m 0 -r"
# Options to klogd
# -2 prints all kernel oops messages twice; once for klogd to decode, and
# once for processing with 'ksymoops'
# -x disables all klogd processing of oops messages entirely
# See klogd(8) for more details
KLOGD_OPTIONS="-x"
#
SYSLOG_UMASK=077
# set this to a umask value to use for all log files as in umask(1).
# By default, all permissions are removed for "group" and "other".


Now restart syslog with

[root@foo ~]# service syslog restart
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
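
To double-check that syslogd is now listening for remote messages, you can look for it bound to UDP port 514 (the PID shown here is just illustrative):

[root@foo ~]# netstat -ulnp | grep :514
udp        0      0 0.0.0.0:514        0.0.0.0:*           2119/syslogd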

Next switch over to bar and open up /etc/syslog.conf. Use @foo as the destination and send over whatever messages you'd like. For me, that's just going to be the stuff that normally ends up in /var/log/messages and /var/log/secure.

[root@bar ~]# vim /etc/syslog.conf

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none /var/log/messages
*.info;mail.none;authpriv.none;cron.none;local0.none @foo

# The authpriv file has restricted access.
authpriv.* /var/log/secure
authpriv.* @foo
# Log all the mail messages in one place.
mail.* -/var/log/maillog

# Log cron stuff
cron.* /var/log/cron

# Everybody gets emergency messages
*.emerg *

# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler

# Save boot messages also to boot.log
local7.* /var/log/boot.log

local0.* /var/log/squid.log


Now still on bar add foo to /etc/hosts.

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost bar
192.168.1.101 foo


Then restart syslog on bar.

[root@bar ~]# service syslog restart
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]

Now test it out. Use the command logger on bar.

[root@bar ~]# logger test

And then on foo, tail messages and see if you got it

[root@foo ~]# tail -1 /var/log/messages
May 22 17:34:29 bar root: test


Now for our little verification script. The way this will work is: every hour we'll create a syslog message on bar, and on foo we'll complain if it's been more than two hours since we got that message from bar.

On bar change directory to /etc/cron.hourly/ and create a shell script called rollcall.sh

[root@bar cron.hourly]# vim rollcall.sh
#!/bin/bash
/usr/bin/logger checking in

Now chmod it executable.
[root@bar cron.hourly]# chmod +x rollcall.sh

On foo change into /usr/local/sbin/ and create a perl script called rollcall_check.pl. This script requires Sys::SyslogMessages and IO::Capture::Stdout. Use cpan2rpm to install them if you don't already have both modules.

[root@foo sbin]# vim rollcall_check.pl
#!/usr/bin/perl -w
# http://greg-techblog.blogspot.com
# This is a dirty hack, but it gets the job done
use strict;

use Sys::SyslogMessages;
use IO::Capture::Stdout;

my $rollcall_conf="/usr/local/etc/rollcallhosts.conf";

open (HOSTS, "$rollcall_conf") || die("Can't open $rollcall_conf");
my @hosts_conf = <HOSTS>;
close HOSTS;
my %hosts = ();
map {chomp; my $key = $_; $hosts{$key} = 0;} @hosts_conf;
my $capture = IO::Capture::Stdout->new();
$capture->start();
my $messages = new Sys::SyslogMessages();
$messages->tail({'number_minutes' => '120'});
$capture->stop();
my @all_lines = $capture->read;
my @matches;
map {$_ =~ /\d\d:\d\d:\d\d\s(\w+).*checking in/; push @matches, ($1) if $1;} @all_lines;
foreach my $host (sort(keys(%hosts))){
    map {$hosts{$host} = 1 if $_ eq $host} @matches;
    print "$host did not report in.\n" if ! $hosts{$host};
}


Now chmod it executable.

[root@foo sbin]# chmod +x rollcall_check.pl

Create one more file in /usr/local/etc/ called rollcallhosts.conf. This will just be a list of hosts we expect to hear from. One hostname per line.

[root@foo etc]# vim rollcallhosts.conf
bar

Finally, symlink rollcall_check.pl into /etc/cron.hourly.

[root@foo cron.hourly]# ln -s /usr/local/sbin/rollcall_check.pl
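
You can also run the checker by hand to make sure it behaves. With the config above it prints nothing when bar has logged a 'checking in' message within the last two hours, and a line like this otherwise:

[root@foo ~]# /usr/local/sbin/rollcall_check.pl
bar did not report in.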


Now our script will only complain if bar hasn't checked in with foo at least once in the last two hours, and root@foo should get an email about it. There is one little problem with this: when logrotate on foo kicks in, you'll likely get a false alarm. Oh well, that shouldn't happen too often.

Extend a Mounted LVM EXT3 Partition

Prerequisites:

You need to have some extra unused space in your volume group. That's pretty much it.
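
If you're not sure how much room is left, vgdisplay will tell you; look at the 'Free  PE / Size' line (vgs gives a one-line summary too).

[root@webapp /]# vgdisplay VolGroup00 | grep -i free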


Dump:


[root@webapp /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
5.9G 4.5G 1.1G 81% /
/dev/xvda1 99M 29M 66M 31% /boot
tmpfs 3.0G 0 3.0G 0% /dev/shm

[root@webapp /]# pvscan
PV /dev/xvda2 VG VolGroup00 lvm2 [15.50 GB / 7.56 GB free]
Total: 1 [15.50 GB] / in use: 1 [15.50 GB] / in no VG: 0 [0 ]

[root@webapp /]# lvextend -L+3G /dev/VolGroup00/LogVol00
Extending logical volume LogVol00 to 9.00 GB
Logical volume LogVol00 successfully resized

[root@webapp /]# resize2fs /dev/VolGroup00/LogVol00
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/VolGroup00/LogVol00 is mounted on /; on-line resizing required
Performing an on-line resize of /dev/VolGroup00/LogVol00 to 2359296 (4k) blocks.
The filesystem on /dev/VolGroup00/LogVol00 is now 2359296 blocks long.

[root@webapp /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
8.8G 4.5G 3.9G 54% /
/dev/xvda1 99M 29M 66M 31% /boot
tmpfs 3.0G 0 3.0G 0% /dev/shm


Done.

Setup or Fix MySQL Replication Fast

So here is how to use LVM snapshots to set up or fix MySQL replication quickly.

Prerequisites:
First you need to have used LVM on the location of your MySQL store. I'm not walking you through that. And you need to have left some unused space in the volume group that your logical volume is in. If you don't know what any of that means, go read up on LVM.

Second, you're going to need to create an ssh key. Also not walking you through that. Set up an ssh key on your slave server so that your master server can ssh to it as root without using a password.

Third, the script I've written assumes your MySQL store is in /var/lib/mysql. And I've written this for CentOS/Redhat, so I'm assuming 'service mysql stop' shuts down your MySQL server. Both of these should be easy to adjust for your installation.

Fourth, make sure skip_slave_start is in your my.cnf on your slave server (see the snippet after this list).

Last, you'll need rsync, perl and the modules Linux::LVM and Getopt::Long. If you're using an rpm based distro, use cpan2rpm to install those modules. Everyone else, CPAN or your package manager.
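
For that fourth point, the relevant bit of the slave's /etc/my.cnf might look something like this (the server-id value is just an example, and MySQL treats dashes and underscores in option names interchangeably):

[mysqld]
server-id=2
skip_slave_start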

The Plan

So what we are going to do is simple. We shut down MySQL on the slave server. Then we connect to the master database, get a list of all the databases, lock everything, and record the master log position. We make sure a replication user is set up. Then we create a snapshot and unlock the master. We mount the snapshot and rsync the databases we got from the master over to the slave, then unmount and remove the snapshot on the master. Finally we start MySQL back up on the slave, set the log position, and start replication. Done.

The Script


#!/usr/bin/perl -w
# Script to fix or setup MySQL replication
# http://greg-techblog.blogspot.com

use strict;
use DBI qw{:sql_types};
use Linux::LVM;
use Getopt::Long;

# Database user. Assumes the slave user and pass on both slave and master
my $dbuser = q{USERNAME};
my $dbpass = q{PASSWORD};

# Should be an ip address, not a hostname
my $dbhost_master = q{192.168.1.2};
my $dbhost_slave = q{192.168.1.3};

# Replication user
my $replication_user = q{repl};
my $replication_pass = q{repl};

# Databases to skip when rsyncing the data directory
my @skip = qw{information_schema};

# Name for our LVM snapshot
my $snapshot_name = q{dbbackup};

# Volume Group name where we are creating the snapshot
my $volgroup = q{VolGroup00};

# Logical Volume to snapshot
my $logvol = q{LogVol02};

# Location of ssh key
my $ssh_key = q{/root/.ssh/id_rsa};


my $dsn1 = qq|DBI:mysql:database=mysql;host=$dbhost_master;port=3306|;
my $dsn2 = qq|DBI:mysql:database=mysql;host=$dbhost_slave;port=3306|;
my $help;

GetOptions (
    "master|m=s" => \$dbhost_master,
    "slave|s=s"  => \$dbhost_slave,
    "ruser=s"    => \$replication_user,
    "rpass=s"    => \$replication_pass,
    "duser=s"    => \$dbuser,
    "dpass=s"    => \$dbpass,
    "key|k=s"    => \$ssh_key,
    "vol|v=s"    => \$volgroup,
    "log|l=s"    => \$logvol,
    "help|?|h"   => \$help,
);

# Print help
if ($help) {
    print "
$0 [options]\n
  --master/-m   IP address of master server (Default: $dbhost_master)
  --slave/-s    IP address of slave server (Default: $dbhost_slave)
  --ruser       Replication username (Default: $replication_user)
  --rpass       Replication password (Default: $replication_pass)
  --duser       Database admin user (Default: $dbuser)
  --dpass       Database admin password (Default: XXXXXX)
  --key/-k      Full path to ssh-key (Default: $ssh_key)
  --vol/-v      Volume Group (Default: $volgroup)
  --log/-l      Logical Volume (Default: $logvol)
  --help/-?/-h  This help\n
";
    exit 0;
}

# Define a statement handle
my $sth;
# Check to see if the snapshot volume already exists and die if it does
my %lvm = get_logical_volume_information($volgroup);
my @lvm = keys %lvm;
if (grep /\/$volgroup\/$snapshot_name$/, @lvm) {
    die ("Snapshot volume already exists. Use lvremove to remove it before running this command.\n");
}

# Shutdown Mysql on slave server
unless ( -e $ssh_key ) {
    die ("Ssh key identity file missing: $ssh_key");
}
exe_cmd(
    qq{ssh -i $ssh_key root\@$dbhost_slave "service mysql stop"},
    qq{Couldn't ssh to slave and stop mysql}
);
print "Mysql on slave stopped\n";

# Connect to Master database
my $dbh = DBI->connect(
    $dsn1,
    $dbuser,
    $dbpass,
    {RaiseError => 0, AutoCommit => 1 }
) or die ("Error Connecting to server: ".DBI::errstr);
my $sth_dbs = $dbh->prepare(q{SHOW DATABASES});
$sth_dbs->execute or die("Problem executing query: ".$sth_dbs->errstr);

# Lock all tables on master database
$sth = $dbh->prepare(q{FLUSH TABLES WITH READ LOCK});
$sth->execute or die("Probelm executing query: ".$sth->errstr);
print "All Databases on master locked\n";

# Get the master log position and file
$sth = $dbh->prepare(q{SHOW MASTER STATUS});
unless ($sth->execute) {
    my $err_msg = $sth->errstr;
    my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
    $sth_unlock->execute;
    die("Problem executing query: $err_msg\n");
}
my $master_status;
unless ($master_status = $sth->fetchrow_hashref) {
    my $err_msg = $sth->errstr;
    my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
    $sth_unlock->execute;
    die("Problem executing query: $err_msg\n");
}

# Create the snapshot
my $lvcreate_msg = `lvcreate -L1G -s -n $snapshot_name /dev/$volgroup/$logvol 2>&1`;
if ( $? ) {
    my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
    $sth_unlock->execute;
    die("Couldn't create snapshot: $lvcreate_msg $?\n");
}
print "Snapshot created\n";

# Unlock all tables
$sth = $dbh->prepare(q{UNLOCK TABLES});
$sth->execute or die("Probelm executing query: ".$sth->errstr);
print "All databases on master unlocked\n";

# Grant replication rights on master
$sth = $dbh->prepare(q{GRANT REPLICATION SLAVE ON *.* TO ?@? IDENTIFIED BY ?});
$sth->execute($replication_user,$dbhost_slave,$replication_pass)
or die("Problem executing query: ".$sth->errstr);

# Mount the snapshot
unless ( -d qq{/mnt/$snapshot_name}) {
    mkdir qq{/mnt/$snapshot_name} or die ("Couldn't create mount point directory: $!");
}
my $mount_snapshot_msg = `mount /dev/$volgroup/$snapshot_name /mnt/$snapshot_name -onouuid,ro 2>&1`;
die ("Couldn't mount snapshot: $mount_snapshot_msg $?") if $?;

# Start rsyncing the snapshot to the slave
print "Starting rsync\n";
my $out = q{};
while ( my ($db) = $sth_dbs->fetchrow_array ) {
    unless (grep /^$db$/, @skip) {
        $out .= `rsync -zrav /mnt/$snapshot_name/lib/mysql/$db/ $dbhost_slave:/var/lib/mysql/$db/ 2>&1`;
        die ("rsync failed: $out $?") if $?;
    }
}

# Disconnect from Master Database
$dbh->disconnect;

# Get rid of snapshot
print $out;
exe_cmd(
    qq{umount /mnt/$snapshot_name},
    qq{Couldn't umount snapshot}
);
exe_cmd(
    qq{lvremove -f /dev/$volgroup/$snapshot_name},
    qq{Couldn't remove snapshot volume}
);
print "Snapshot removed\n";

# Start Mysql back up on slave (skip_slave_start better be in the my.cnf)
exe_cmd(
    qq{ssh -i $ssh_key root\@$dbhost_slave "service mysql start"},
    qq{Couldn't ssh to slave and start mysql}
);
print "Mysql on slave started\n";

# Connect to slave and setup replication
$dbh = DBI->connect( $dsn2, $dbuser, $dbpass, { RaiseError => 0, AutoCommit => 1 } )
or die ("Error Connecting to server: ".DBI::errstr);
my $change_master_query = q{
    CHANGE MASTER TO MASTER_HOST = ?,
    MASTER_USER = ?,
    MASTER_PASSWORD = ?,
    MASTER_LOG_FILE = ?,
    MASTER_LOG_POS = ?
};
$sth = $dbh->prepare($change_master_query) or die ("Problem preparing query\n $change_master_query\n".DBI::errstr);
# Make MASTER_LOG_POS an INT
$sth->bind_param(5,1,SQL_INTEGER);
$sth->execute(
    $dbhost_master,
    $replication_user,
    $replication_pass,
    $master_status->{'File'},
    int($master_status->{'Position'})
) or die ("Problem changing master:".$sth->errstr);

# Start slave
$sth = $dbh->prepare(q{START SLAVE});
$sth->execute or die ("Problem starting slave:".$sth->errstr);

# Check to see if everything worked
print "Sleep for 5 seconds...\n";
sleep 5; # this may need to be longer
$sth = $dbh->prepare(q{SHOW SLAVE STATUS});
$sth->execute;
my $slave_status = $sth->fetchall_arrayref({});
print "Slave_SQL_Running:\t".$slave_status->[0]->{'Slave_SQL_Running'}."\n";
print "Slave_IO_Running:\t".$slave_status->[0]->{'Slave_IO_Running'}."\n";
print "Slave_IO_State:\t".$slave_status->[0]->{'Slave_IO_State'}."\n";
print "Last_IO_Error:\t".$slave_status->[0]->{'Last_IO_Error'}."\n";
print "Seconds_Behind_Master:\t".$slave_status->[0]->{'Seconds_Behind_Master'}."\n";

sub exe_cmd {
    my $cmd = shift;
    my $fail_msg = shift;
    my $msg = `$cmd 2>&1`;
    chomp $msg;
    die ("$fail_msg : $msg\n$?") if $?;
}
__END__
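
Save the script somewhere handy on the master and run it from there. A typical invocation, overriding the hard-coded defaults, might look like this (the script name and password are obviously placeholders):

./fix_replication.pl --master 192.168.1.2 --slave 192.168.1.3 \
--duser root --dpass PASSWORD --vol VolGroup00 --log LogVol02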

OS X Time Machine Over Your Network Without a Time Capsule

So I spent the other day figuring out how to do this, and then when I went to price a NAS, Time Capsules were going for $200, so I bought one. I'm generally happy with it. Just as I was thinking I had wasted my time figuring out how to make Time Machine work with an unsupported network device, a co-worker mentioned to me today that he needed to get this working for our Macs at work. Ha! I said, I'll get to use this bit of knowledge after all. So here it is.

Start by connecting to your network storage through either SMB or AFP. Create a file called

.com.apple.timemachine.supported
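
From the terminal that might look like this, assuming the share is mounted at /Volumes/backup (use whatever mount point yours actually gets):

Laptop:~ greg$ touch /Volumes/backup/.com.apple.timemachine.supported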

Next you're going to need your machine name and the MAC address of your network card.

Drop to a terminal to get both of these. Your machine name should be your hostname, or the first part of your command prompt, the bit before the colon.

Laptop:~ greg$

Next, execute
ifconfig

Locate the line that says
ether
under the network interface card. That's your MAC address.

en0: flags=8863 mtu 1500
ether 00:13:db:9d:62:fc


Now take that machine name and the MAC address minus the colons; together they form the name of the disk image you'll need to create to make this work. Put them together separated by an underscore.

Laptop_0013db9d62fc
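
If you'd rather have the shell build that string for you, something like this works, assuming en0 is the interface you looked at above:

Laptop:~ greg$ echo "$(hostname -s)_$(ifconfig en0 | awk '/ether/ {print $2}' | tr -d ':')"
Laptop_0013db9d62fc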

Now you'll need to create a disk image. Use this command, replacing my image name with yours, and add .sparsebundle to the end of it.


Laptop:~ greg$ hdiutil create -library SPUD -megabytes 204800 -fs HFS+J \
-type SPARSEBUNDLE -volname "Laptop_0013db9d62fc.sparsebundle" \
"Laptop_0013db9d62fc.sparsebundle"



You should size the disk image at about twice the size of the drive you are backing up. Once the disk image has been created, drop it on your network storage device.

One last step and you're good to go. Execute this at the terminal.

defaults write com.apple.systempreferences \
TMShowUnsupportedNetworkVolumes 1


Now connect to your network storage device and open up Time Machine. Hit Change Disk and your network storage device should show up as an available device.

Of course this is all subject to change. I read through a number of how-to postings on this written at different times, and it seems Apple keeps making this process more and more complicated, but as of the writing of this post, the process above works.