7.09.2009

Compare Two Lists of Unique Items in Perl (The Gregorian Join)

Let's say you wanted to compare two lists of unique items and see what's in list A and not B, what's in B and not A, and what's in both. Say, for example, you had a list of all the RPMs installed on two different machines (the output of the command rpm -qa). Well, I can't say if this is the best way or not, but it works, and I think it looks pretty cool.

my (%comp, @a_only, @b_only, @both, $pushref);
map $comp{$_} = 2, @a;   # everything in A starts at 2
map $comp{$_}++, @b;     # everything in B adds 1: 3 = both, 1 = B only
map {
    $pushref = $comp{$_} == 3 ? \@both
             : $comp{$_} == 2 ? \@a_only
             :                  \@b_only;
    push(@$pushref, $_);
} keys %comp;
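
To feed it the two rpm lists, a minimal driver might look like this (the file names are just placeholders for wherever you saved each machine's rpm -qa output):

open my $fh_a, '<', 'hostA-rpms.txt' or die "hostA-rpms.txt: $!";
chomp(my @a = <$fh_a>);
open my $fh_b, '<', 'hostB-rpms.txt' or die "hostB-rpms.txt: $!";
chomp(my @b = <$fh_b>);

# ... run the map code above, then:
print "A only: @a_only\n";
print "B only: @b_only\n";
print "Both: @both\n";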

7.02.2009

LWP: Test Posting Form Data

LWP::UserAgent is perl's web browser. If you need to GET, POST or PUT, LWP is your friend. It's very simple to use and very powerful. My example script today will use LWP::Parallel::UserAgent to POST form data to (in my case, log in to) a bunch of sites in parallel. Then it will grep the response data from each site, looking for data we expect to see in the response from the POST (after we log in).

#!/usr/bin/perl
############################################################
# http://greg-techblog.blogspot.com
# Post data to a bunch of sites in parallel and test the responses for a string
############################################################
use strict;
use warnings;
use Getopt::Long;
use LWP::Parallel::UserAgent;
use HTTP::Request::Common;
use HTTP::Headers;
use HTTP::Cookies;

# --debug or -d as arg will turn on debugging
my $debug;
GetOptions( 'debug|d' => \$debug );


# Regex to check for in response content from post
my $regex = q{logout};

# List of sites we want to connect to
my @sites = qw{ http://www.siteOne.com http://www.siteTwo.com http://www.siteThree.com };
# Define your form fields and values for POST requests
my %form = (
    'login_username'     => 'admin',
    'login_password'     => 'password',
    'hidden_form_field1' => '1',
);
# Url encode form (naive: only spaces get escaped; use URI::Escape
# if your values contain other special characters)
my $form_encoded = join '&', map { "$_=$form{$_}" } keys %form;
# Replace spaces for url encoding
$form_encoded =~ s/\s/%20/g;

# Header object for request
my $header = HTTP::Headers->new('Content-Type' => 'application/x-www-form-urlencoded');

# Create request objects for each site
my $reqs = [];
map {
    my $req = HTTP::Request->new( 'POST', qq{$_/}, $header, $form_encoded );
    push(@$reqs, $req);
} @sites;


# Execute request
my %stdout = parallel_reqs($reqs,$regex);

map { print "Failed to login to $_\n" unless $stdout{$_}; } keys %stdout;

######################################################
# Post form data to a bunch of sites and grep the
# response data for a string
# Pass:
# ref array of HTTP::Request objects
# regex to check response for
# Return:
# A hash keyed by urls with boolean success/fail values
######################################################
sub parallel_reqs {
    my ($reqs, $regex) = @_;
    my $pua = LWP::Parallel::UserAgent->new();
    # $pua->in_order (1); # handle requests in order of registration
    $pua->duplicates(1);  # do not ignore duplicates
    $pua->timeout (60);   # in seconds
    $pua->redirect (1);   # follow redirects
    $pua->nonblock (0);   # disable nonblocking
    $pua->max_hosts (8);  # number of hosts to connect to at once
    $pua->max_req (20);   # number of requests per host

    # Allow POSTs to be redirected
    push @{ $pua->requests_redirectable }, 'POST';

    # Enable Cookies
    my $cookie = HTTP::Cookies->new();
    $pua->cookie_jar($cookie);

    # Register our requests
    foreach my $req (@$reqs) {
        if ( my $res = $pua->register($req) ) {
            print STDERR $res->error_as_HTML if $debug;
        }
    }
    my $entries = $pua->wait(10);

    # Loop through the response content looking for our regex
    my %return;
    foreach (keys %$entries) {
        my $response = $entries->{$_}->response;
        print $response->request->url."\t\t".$response->message."\t".$response->code."\n" if $debug;
        $return{$response->request->url} = $response->content =~ /$regex/ ? 1 : 0;
    }
    return %return;
}

6.25.2009

Perl One-Liner Multithreaded Ping Scanner

Just another one-liner today. This launches a thread for each address you want to scan in the range you provide.

perl -e 'use threads; ($n,@r) = @ARGV; map { $t{$_} = threads->create(sub { my $i = shift; print "$n$i up\n" if `ping -c 1 $n$i` =~ /1 received/ }, $_) } $r[0]..$r[1]; map $t{$_}->join, keys %t;' 192.168.50. 1 254
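
If you'd rather have it as a readable script, the same logic looks roughly like this (same three arguments: network prefix, start of range, end of range):

#!/usr/bin/perl
use strict;
use warnings;
use threads;

my ($net, $start, $end) = @ARGV;
my %t;
for my $i ($start .. $end) {
    # one thread per address
    $t{$i} = threads->create(sub {
        my $ip = shift;
        print "$ip up\n" if `ping -c 1 $ip` =~ /1 received/;
    }, "$net$i");
}
$_->join for values %t;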

6.18.2009

Get a List of IPs that have Been Refused

Just a one-liner today.

grep refused /var/log/secure | perl -ne '$h{$1}++ if /from ::ffff:((?:\d{1,3}\.){3}\d{1,3})\b/; END { print "$_ = $h{$_}\n" for grep { $h{$_} > 10 } keys %h }'

This prints a list of all IPs that have received a 'refused connection' message more than 10 times, along with how many times each has been refused.
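
Spelled out as a script, the same logic looks roughly like this:

#!/usr/bin/perl
# usage: ./refused.pl /var/log/secure
use strict;
use warnings;

my %h;
while (<>) {
    next unless /refused/;
    # count each source IP (my logs show IPv4-mapped ::ffff: addresses)
    $h{$1}++ if /from ::ffff:((?:\d{1,3}\.){3}\d{1,3})\b/;
}
print "$_ = $h{$_}\n" for grep { $h{$_} > 10 } keys %h;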

6.16.2009

Forking in Perl

Yesterday I posted how to thread in perl. Today I figured I'd explain forking. There are a few reasons you might choose to fork instead of thread. One is that threading is a compile-time option for perl that isn't enabled in every installation. Another is that not every module is thread safe. Most of the time you can just avoid loading the modules that aren't before you start threading, but if you need such a module inside your thread, you'll need to switch to forking. Lastly, forking is more efficient than threading in some cases. I'll explain this in more detail in a later post.

Here I've written a sample script that is similar to my threading function.


######################################
# http://greg-techblog.blogspot.com
# Fork child processes for a list of hosts, running a function
# you pass to it. Passes host to function as argument.
# args:
#   number of children to fork at a time
#   ref to sub to run
#   ref array of hosts or instances (will be passed to sub as arg)
# returns:
#   a hash of returns from sub keyed by host or instance name
######################################
use strict;
use warnings;
my $debug;

sub fork_run {
    use POSIX qw( WNOHANG );
    my ($children, $func_to_fork, $hosts) = @_;
    my %stdout;
    my $n = 0;

    # This makes sure we don't start more than $children children at a time
    my $m = $children-1 > $#{$hosts} ? $#{$hosts} : $children-1;

    while (1) {
        last if $n >= $#{$hosts}+1;
        foreach my $host (@{$hosts}[$n..$m]) {
            print "Forking for $host $n-$m\n" if $debug;
            local *FROM_CHILD;
            pipe(FROM_CHILD, TO_PARENT);
            my $pid = fork();
            if ( not defined $pid ) {
                # Something is wrong
                print "resources not available.\n";
            }
            elsif ( $pid == 0 ) {
                # This is the child
                close(FROM_CHILD);
                # Make TO_PARENT the default output and turn on autoflush
                select(TO_PARENT);
                $| = 1;
                $func_to_fork->($host);
                close(TO_PARENT);
                exit;
            }
            else {
                # This is the parent
                close(TO_PARENT);
                # Slurp everything the child wrote (blocks until the child
                # closes its end of the pipe)
                $stdout{$host} = do { local $/; <FROM_CHILD> };
                close FROM_CHILD;
            }
        }

        # reap any children that have finished
        my $kid;
        do {
            $kid = waitpid(-1, WNOHANG);
        } while $kid > 0;

        # work out the next range of instances to work on
        $n = $m == 0 ? 1 : $m+1;
        $m = $n+$children-1 < $#{$hosts} ? $n+$children-1 : $#{$hosts};
        print "$n-$m\n\n" if $debug;
    }
    return %stdout;
}



To use this, simply do something like this:

my @hosts = qw{ host1 host2 host3 host4 host5 };
sub myFunc { my $hostname = shift; print TO_PARENT `ping -c 1 $hostname`; return 1; }

# Fork two children at a time that ping hosts
my %stdout = fork_run( 2, \&myFunc, \@hosts);
map { print $stdout{$_}; } keys %stdout;


6.15.2009

Threading in Perl

Threading in Perl may seem daunting at first, but it's actually easier than it looks. In some ways, threading is more straightforward than forking. Communication between the threads and the parent process is, in my opinion, much more straightforward than communication between child and parent processes.

Here I've written a little sample function that will take a number of threads to launch at a time, a reference to a function and an array of hostnames. It will launch one thread per hostname, passing the hostname to the function, until the limit has been reached. It then waits for each thread to become joinable (finish) and launches more threads until the limit is reached again. When all threads are finished, it returns the output from all threads as a hash keyed by the hostnames.



######################################
# http://greg-techblog.blogspot.com
# Spawns threads for a list of hosts, running a function
# you pass to it. Passes host to function as argument.
# args:
#   number of threads to launch at a time
#   ref to sub to run
#   ref array of hosts or instances (will be passed to sub as arg)
# returns:
#   a hash of returns from sub keyed by host or instance name
######################################
use strict;
use warnings;
use threads;
my $debug;

sub run_threaded {
    my ($threads, $func_to_thread, $hosts) = @_;
    my (%stdout, %threads);
    my $n = 0;

    # This makes sure we don't start more than $threads threads at a time
    my $m = $threads-1 > $#{$hosts} ? $#{$hosts} : $threads-1;
    while (1) {
        last if $n >= $#{$hosts}+1;
        foreach my $host (@{$hosts}[$n..$m]) {
            print "Launching thread for $host $n-$m\n" if $debug;
            unless ( $threads{$host} = threads->create($func_to_thread, $host) ) {
                print "Error on $host\n";
            }
        }
        # join this batch, collecting each thread's return value
        map { $stdout{$_} = $threads{$_}->join if $threads{$_}; } @{$hosts}[$n..$m];
        # work out the next range of instances to work on
        $n = $m == 0 ? 1 : $m+1;
        $m = $n+$threads-1 < $#{$hosts} ? $n+$threads-1 : $#{$hosts};
        print "$n-$m\n\n" if $debug;
    }
    return %stdout;
}



To use this, simply do something like this:

use Data::Dumper;
my @hosts = qw{ host1 host2 host3 host4 };
sub myFunc { my $hostname = shift; return `ping -c 1 $hostname`; }

# Launch two threads at a time that ping hosts
my %stdout = run_threaded( 2, \&myFunc, \@hosts);
print Dumper \%stdout;

Kill a Ton of Queries in MySQL

Every so often I'll have a misbehaving query that read locks a table that is essential for every other query in our application. Obviously this is a bug in our application that should never have made it past the Q/A process, but let's get back to the real world, where programmers make these mistakes and Q/A isn't perfect. When this happens, a ton of queries back up behind the query that read locked the table. At this point, simply killing the primary query isn't going to fix the problem, because the load generated by all the queries waiting for the read lock to end will kill the server. There are only two solutions at this point: restart MySQL, or kill all the stuck queries. Killing all the queries is really the best solution for our application, but the number is typically daunting. Most of the time I've got as many queries as whatever my connection limit is set to. So I came up with this little one-liner to get a list of all the processes for a user on the server and kill them.

mysql -u username -ppassword -e "show processlist" | sed '1d' | grep mysql_username | awk '{print "kill ", $1, ";"}' | xargs -i mysql -u username -ppassword -e "{}"

I'm sure someone has a better way to do this, but this works pretty awesomely.
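
As a variation: if your server has the INFORMATION_SCHEMA.PROCESSLIST table (MySQL 5.1 and later), you can have the server generate the KILL statements itself and skip the sed/awk step (same placeholder credentials as above):

mysql -u username -ppassword -N -e "SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist WHERE user = 'mysql_username'" | mysql -u username -ppassword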

Inventory Your Machines with a One-liner

The command dmidecode dumps the DMI table from your machine's BIOS in human-readable text. Depending on your machine, this should contain some really useful info: basically a total hardware profile on most machines, including info like service tags and model numbers. I recently needed to quickly inventory a bunch of Dell servers and came up with this one-liner, which will produce a CSV line of text for the machine it's run on. Combine these lines of text into one file, import into Excel, and your inventory is done.

Format is:
'Hostname','Distro-Release Version','UUID','Service Tag','Model','Processor Version','Memory'

echo `hostname` , `cat /etc/*-release` , `dmidecode |egrep UUID |sed "s/\s\+UUID: //"` , `dmidecode -s system-serial-number` , `dmidecode -s system-product-name` , `dmidecode -s processor-version` , `dmesg |egrep Mem |awk '{ print $2; }'`

Now of course you'll need to modify this command for your own hardware; this won't even work for every Dell. But the output of dmidecode is readable enough to just scan through it and figure out what you need. Also look in /proc/ for other info you might want to add to your inventory.

6.08.2009

Add Up a Column of Data with Awk

When you pipe a line of text to awk, every "word" (string of non-whitespace) shows up in $n, where n is the position in which it appears in the line of text. For example:

>echo this is a test | awk '{print $2};'

prints
is

If you have a column of numbers, say from the output of a command or a log file, you can do math with these numbers like this

ps aux | awk '{ t += $6 }; END { print t };'

This will give you the total for the sixth column of the output from the command ps aux. In another example, let's say I wanted to see the total number of clients with active and inactive connections on my LVS server. The command to see this info is


[root@lvs~]#ipvsadm -L
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP www:https sed
-> webapp06:https Route 110 54 86
-> webapp01:https Route 70 36 46
-> webapp04:https Route 300 148 207
-> webapp03:https Route 300 146 307
-> webapp02:https Route 120 58 77
-> webapp05:https Route 300 142 183

That's nice, but you'll notice I don't get totals. The command doesn't have any syntax to give me totals either. So we'll let awk take care of it. First we need to grep just the lines that we want to add up. In this case they all have the text 'webapp0' in them, and that string of text doesn't appear on any lines we don't want to add up.

[root@lvs~]#ipvsadm -L |grep webapp0
-> webapp06:https Route 110 72 80
-> webapp01:https Route 70 45 53
-> webapp04:https Route 300 198 267
-> webapp03:https Route 300 196 268
-> webapp02:https Route 120 79 105
-> webapp05:https Route 300 199 240

Now the two columns we'd like to add up are the 5th and the 6th.

[root@lvs~]#ipvsadm -L |grep webapp0|awk '{s += $5; i += $6} END { print s,i; };'
846 1066

Now let's say that instead of a total, we wanted an average. 'NR' is equal to the number of rows we sent awk.

[root@lvs~]#ipvsadm -L |grep webapp0|awk '{s += $5; i += $6} END { print s/NR,i/NR; };'
68.5833 88.4167






6.04.2009

MaxClients in Apache with Preforking MPM

The advice for figuring out what your MaxClients setting should be in Apache with the prefork MPM is pretty straightforward. First figure out the average amount of RAM each Apache process uses. Then divide the amount of RAM your system has, minus a little for other processes, by that average. In CentOS the following command should give you that number.

ps -ylC httpd |awk "{ s += \$8 } END { m = `cat /proc/meminfo|grep MemTotal|awk '{print $2}'`; printf \"MaxClients: %i\\n\", (m-(m*.15))/(s/NR) }"
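
To make the arithmetic concrete (numbers made up for illustration): on a box with 4096 MB of RAM, holding back 15% for everything else, with httpd processes averaging 25 MB each, you'd get

MaxClients = (4096 - (4096 * .15)) / 25 = 139.26, so call it 139.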

5.27.2009

Working with RPMs on CentOS

I'm pretty much going to just start spitting out all my favorite rpm commands here. Assuming CentOS, but most of this is applicable to any rpm-based distro.

List every rpm installed on your system, pipe it through grep to find something specific
rpm -qa

Info about a specific rpm: version number, build host and other useful info
[root@webapp01 ~]# rpm -qi php
Name : php Relocations: (not relocatable)
Version : 5.2.2 Vendor: (none)
Release : 3 Build Date: Thu 13 Mar 2008 04:31:00 PM PDT
Install Date: Fri 14 Mar 2008 11:02:12 AM PDT Build Host: dev.myhost.com
Group : Development/Languages Source RPM: php-5.2.2-3.src.rpm
Size : 12049373 License: The PHP License v3.01
Signature : (none)
URL : http://www.php.net/
Summary : The PHP HTML-embedded scripting language. (PHP: Hypertext Preprocessor)
Description :
PHP is an HTML-embedded scripting language. PHP attempts to make it
easy for developers to write dynamically generated webpages. PHP also
offers built-in database integration for several commercial and
non-commercial database management systems, so writing a
database-enabled webpage with PHP is fairly simple. The most common
use of PHP coding is probably as a replacement for CGI scripts.

The php package contains the module which adds support for the PHP


What rpm a specific file is from
rpm -qf /path/filename

List everything installed by an rpm
rpm -ql packagename

Turn any perl CPAN module into an rpm and install it
cpan2rpm -i Getopt::Long

After you've installed a src rpm, change into /usr/src/redhat/SPECS/ and execute this using your package's specfile; it builds the rpm and installs it. If you don't have a src rpm but you've got a specfile for a source tarball, put the specfile in /usr/src/redhat/SPECS/ and the tarball in /usr/src/redhat/SOURCES.
rpmbuild -bi packname.spec

Creates an rpm repo in your current directory from the rpms in that directory
createrepo .

Searches all your rpm repos for a specific file
yum provides '*/rpmbuild'

List/Install/Remove package groups. Package groups are groups of rpms that are generally needed together, e.g. 'Development Libraries' or 'KDE (K Desktop Environment)'
yum grouplist / yum groupinstall / yum groupremove

Fun with Find

Recently I needed to clean up a bunch of temp files being created by a web application but not being properly removed. I turned to find as a stopgap solution while the real problem was being solved. In my situation I knew that no file should survive longer than two hours in the directory where the files were being created. So, using find, I check the ctime on the files and remove them if they are more than 120 minutes old. Using find's -cmin option with +120 will give us files that have not been changed in the last 120 minutes; a negative number (-120) would give us only files that have been changed in the last 120 minutes. Since I only want to delete files and keep directories, I also use the -type f option. If I simply run

shell> find /path/to/files -cmin +120 -type f

This will give me a list of all files below the path I've given find that have not been changed in the last 120 minutes. If I wanted to make sure find didn't list files in subdirectories, I would add the option -maxdepth 1.

So now we have a list of the files we want to delete; the -exec option will let us delete them. This is pretty much the same as piping the output of find to xargs, but it should use a little less overhead. With -exec, the command we supply is executed once for each thing find finds: find 25 items and your command is executed 25 times. Since we want each found file to be part of the command, we use '{}' as a placeholder for it. We also need to let find know where the command we are supplying ends. You use a ';' for that, but since ';' has its own meaning in bash, it needs to be escaped. So our option to remove all the files we've found without confirmation looks like

-exec rm -f '{}' \;

If we used '+' instead of ';', find would string many of our files together and execute the command fewer times. This can be more efficient, depending on the command we are running.
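
For example, the whole cleanup using the '+' variant would look like this ('+' doesn't need to be escaped):

find /path/to/files -cmin +120 -type f -exec rm -f '{}' +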

Now I needed to run this command every hour, so I created a little bash script to start the process. I set its nice level high so its priority would be low, and set up a little check to make sure we don't start this job again if the last run hasn't finished.

#!/bin/bash

# http://greg-techblog.blogspot.com
TMPPATH=/var/www/html/tmp/

# Check if cleantmp.cron is already running
if [ ! -f /var/run/cleantmp.pid ]; then
    # Delete all files in tmp that are older than two hours
    # run at lowest priority
    nice -n 19 find $TMPPATH -nowarn -cmin +120 -type f -exec rm -f '{}' \; &
    PID=$!                              # Get the pid of the find process
    echo "$PID" > /var/run/cleantmp.pid # create a pidfile
    wait $PID                           # wait for find to finish
    rm -f /var/run/cleantmp.pid         # remove pidfile
    exit
fi

# cleantmp is already running
echo cleantmp.cron already running
cat /var/run/cleantmp.pid # print pid of find process
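
To schedule it hourly, drop the script into /etc/cron.hourly/, or add a line like this to /etc/crontab (the path is a placeholder for wherever you keep the script):

0 * * * * root /usr/local/sbin/cleantmp.cron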

5.26.2009

Dealing with Memory Usage in Apache

Whatever amount of memory an Apache process uses to fulfill a request, that process will continue to hold onto that amount of memory for the lifetime of the process. When you do a graceful restart of Apache, you cause all your current processes to end as soon as they are finished serving the requests they have already received, and then Apache creates a new set of processes. This typically frees up large blocks of memory used to serve old requests. Apache does this on its own, controlled by the MaxRequestsPerChild directive. This is the number of requests a process should serve during its lifetime. Making this number too low will cause Apache to tear down and create new processes quickly, increasing CPU overhead. Setting MaxRequestsPerChild too high will cause your machine to use more memory than it likely needs to, possibly causing your system to swap, or even run out of memory completely. Apache has no other means of garbage collection.

The default config for Apache on CentOS 5.3 has MaxRequestsPerChild set at 4000. I've seen people suggest that if your application primarily serves dynamic content, a setting as low as 20 may make sense. I've opted for 800 to 1000. It's a number I'm still playing with. Setting it this low has reduced the overall amount of memory my systems use without increasing my CPU load significantly. It's something you'll need to experiment with.
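
For reference, MaxRequestsPerChild lives with the rest of the prefork settings in httpd.conf. The surrounding values here are illustrative, not recommendations:

<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 1000
</IfModule>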

5.22.2009

Centralize Syslogging Quick and Dirty

Let's start with the fact that this shouldn't be done across a public network. If you need to do that, you should do basically the same thing as all this, except through stunnel, but I'm not going to get into how to set all that up right now. Basically what we are going to do is set up one machine to accept syslog messages from anywhere (remember, private network here; it would be a security risk to allow anyone to send you syslog messages). Then we configure another machine to send its syslog messages to the first machine. Last, we'll set up cron jobs on both machines to make sure we're still receiving messages from the remote server.

I'm using CentOS/Redhat for this, so this setup will differ a little on different distros, but not much.

Start with the machine we want to receive syslog messages on. We'll call this machine foo, and the machine we want to send messages from, bar. Inventive... I know.

On foo we open up /etc/sysconfig/syslog and add the -r option

[root@foo ~]# vim /etc/sysconfig/syslog

# Options to syslogd
# -m 0 disables 'MARK' messages.
# -r enables logging from remote machines
# -x disables DNS lookups on messages received with -r
# See syslogd(8) for more details
SYSLOGD_OPTIONS="-m 0 -r"
# Options to klogd
# -2 prints all kernel oops messages twice; once for klogd to decode, and
# once for processing with 'ksymoops'
# -x disables all klogd processing of oops messages entirely
# See klogd(8) for more details
KLOGD_OPTIONS="-x"
#
SYSLOG_UMASK=077
# set this to a umask value to use for all log files as in umask(1).
# By default, all permissions are removed for "group" and "other".


Now restart syslog with

[root@foo ~]# service syslog restart
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]

Next switch over to bar and open up the syslog.conf file. Use @foo as the destination and send over whatever messages you'd like. For me, that's just going to be the stuff that normally ends up in /var/log/messages and /var/log/secure.

[root@bar ~]# vim /etc/syslog.conf

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none /var/log/messages
*.info;mail.none;authpriv.none;cron.none;local0.none @foo

# The authpriv file has restricted access.
authpriv.* /var/log/secure
authpriv.* @foo
# Log all the mail messages in one place.
mail.* -/var/log/maillog

# Log cron stuff
cron.* /var/log/cron

# Everybody gets emergency messages
*.emerg *

# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler

# Save boot messages also to boot.log
local7.* /var/log/boot.log

local0.* /var/log/squid.log


Now, still on bar, add foo to /etc/hosts.

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost bar
192.168.1.101 foo


Then restart syslog on bar.

[root@bar ~]# service syslog restart
Shutting down kernel logger: [ OK ]
Shutting down system logger: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]

Now test it out. Use the command logger on bar.

[root@bar ~]# logger test

And then on foo, tail messages and see if you got it

[root@foo ~]# tail -1 /var/log/messages
May 22 17:34:29 bar root: test


Now for our little verification script. The way this will work: every hour we'll create a syslog message on bar, and on foo we'll complain if it's been more than two hours since we got that message from bar.

On bar change directory to /etc/cron.hourly/ and create a shell script called rollcall.sh

[root@bar cron.hourly]# vim rollcall.sh
#!/bin/bash
/usr/bin/logger checking in

Now chmod it executable.
[root@bar cron.hourly]# chmod +x rollcall.sh

On foo change into /usr/local/sbin/ and create a perl script called rollcall_check.pl. This script requires Sys::SyslogMessages and IO::Capture::Stdout. Use cpan2rpm to install them if you don't already have both modules.

[root@foo sbin]# vim rollcall_check.pl
#!/usr/bin/perl -w
# http://greg-techblog.blogspot.com
# This is a dirty hack, but it gets the job done
use strict;

use Sys::SyslogMessages;
use IO::Capture::Stdout;

my $rollcall_conf = "/usr/local/etc/rollcallhosts.conf";

open (HOSTS, "$rollcall_conf") || die("Can't open $rollcall_conf");
my @hosts_conf = <HOSTS>;
close HOSTS;
my %hosts = ();
map { chomp; $hosts{$_} = 0; } @hosts_conf;

# Capture the last two hours of syslog messages (Sys::SyslogMessages
# prints to STDOUT, so we grab that with IO::Capture::Stdout)
my $capture = IO::Capture::Stdout->new();
$capture->start();
my $messages = new Sys::SyslogMessages();
$messages->tail({'number_minutes' => '120'});
$capture->stop();
my @all_lines = $capture->read;

# Pull out the hostname from every 'checking in' message
my @matches;
map { push @matches, $1 if /\d\d:\d\d:\d\d\s(\w+).*checking in/; } @all_lines;
foreach my $host (sort(keys(%hosts))) {
    map { $hosts{$host} = 1 if $_ eq $host } @matches;
    print "$host did not report in.\n" if ! $hosts{$host};
}


Now chmod it executable.

[root@foo sbin]# chmod +x rollcall_check.pl

Create one more file in /usr/local/etc/ called rollcallhosts.conf. This will just be a list of hosts we expect to hear from, one hostname per line.

[root@foo etc]# vim rollcallhosts.conf
bar

Finally, symlink rollcall_check.pl into /etc/cron.hourly.

[root@foo cron.hourly]# ln -s /usr/local/sbin/rollcall_check.pl


Now our script will only complain if bar hasn't checked in with foo at least once in the last two hours, and root@foo should get an email about it. There is one little problem with this: when logrotate on foo kicks in, you'll likely get a false message. Oh well, that shouldn't happen too often.

Extend a Mounted LVM EXT3 Partition

Prerequisites:

You need to have some extra unused space in your volume group. That's pretty much it.
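
If you want to check how much free space your volume group has before starting, either of these will show you:

vgs
vgdisplay VolGroup00 | grep Free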


Dump:


[root@webapp /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
5.9G 4.5G 1.1G 81% /
/dev/xvda1 99M 29M 66M 31% /boot
tmpfs 3.0G 0 3.0G 0% /dev/shm

[root@webapp /]# pvscan
PV /dev/xvda2 VG VolGroup00 lvm2 [15.50 GB / 7.56 GB free]
Total: 1 [15.50 GB] / in use: 1 [15.50 GB] / in no VG: 0 [0 ]

[root@webapp /]# lvextend -L+3G /dev/VolGroup00/LogVol00
Extending logical volume LogVol00 to 9.00 GB
Logical volume LogVol00 successfully resized

[root@webapp /]# resize2fs /dev/VolGroup00/LogVol00
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/VolGroup00/LogVol00 is mounted on /; on-line resizing required
Performing an on-line resize of /dev/VolGroup00/LogVol00 to 2359296 (4k) blocks.
The filesystem on /dev/VolGroup00/LogVol00 is now 2359296 blocks long.

[root@webapp /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
8.8G 4.5G 3.9G 54% /
/dev/xvda1 99M 29M 66M 31% /boot
tmpfs 3.0G 0 3.0G 0% /dev/shm


Done.

Setup or Fix MySQL Replication Fast

So here is how to use LVM snapshots to set up or fix MySQL replication quickly.

Prerequisites:
First you need to have used LVM on the location of your MySQL store. I'm not walking you through that. And you need to have left some unused space in the volume group that your logical volume is in. If you don't know what any of that means, go read up on LVM.

Second, you're going to need to create an ssh key. Also not walking you through that. Set up an ssh key on your slave server so that your master server can ssh to it as root without using a password.

Third, the script I've written assumes your MySQL store is in /var/lib/mysql. And I've written this for CentOS/Redhat, so I'm assuming
service mysql stop
shuts down your MySQL server. Both of these should be easy to fix for your installation.

Fourth, make sure skip_slave_start is in your my.cnf on your slave server.
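
That is, the [mysqld] section of the slave's my.cnf should contain a line like:

[mysqld]
skip_slave_start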

Last, you'll need rsync, perl and the modules DBI, DBD::mysql, Linux::LVM and Getopt::Long. If you're using an rpm-based distro, use cpan2rpm to install those modules. Everyone else, CPAN or your package manager.

The Plan

So what we are going to do is simple. We shut down MySQL on the slave server. Then we connect to the master database, get a list of all the databases, lock them all and get the master log position. We make sure a replication user is set up. Then we create a snapshot and unlock the master. We mount the snapshot and rsync the databases we got from the master to the slave, then unmount and remove the snapshot on the master. Finally we start MySQL back up on the slave, set the log position and start replication. Done.

The Script


#!/usr/bin/perl -w
# Script to fix or setup MySQL replication
# http://greg-techblog.blogspot.com

use strict;
use DBI qw{:sql_types};
use Linux::LVM;
use Getopt::Long;

# Database user. Assumes the same user and pass on both slave and master
my $dbuser = q{USERNAME};
my $dbpass = q{PASSWORD};

# Should be an ip address, not a hostname
my $dbhost_master = q{192.168.1.2};
my $dbhost_slave = q{192.168.1.3};

# Replication user
my $replication_user = q{repl};
my $replication_pass = q{repl};

# Databases to skip when rsyncing the data directory
my @skip = qw{information_schema};

# Name for our LVM snapshot
my $snapshot_name = q{dbbackup};

# Volume Group name where we are creating the snapshot
my $volgroup = q{VolGroup00};

# Logical Volume to snapshot
my $logvol = q{LogVol02};

# Location of ssh key
my $ssh_key = q{/root/.ssh/id_rsa};


my $dsn1 = qq|DBI:mysql:database=mysql;host=$dbhost_master;port=3306|;
my $dsn2 = qq|DBI:mysql:database=mysql;host=$dbhost_slave;port=3306|;
my $help;

GetOptions (
"master|m=s" => \$dbhost_master,
"slave|s=s" => \$dbhost_slave,
"ruser=s" => \$replication_user,
"rpass=s" => \$replication_pass,
"duser=s" => \$dbuser,
"dpass=s" => \$dbpass,
"key|k=s" => \$ssh_key,
"vol|v=s" => \$volgroup,
"log|l=s" => \$logvol,
"help|?|h" => \$help,
);

# Print help
if ($help) {
print "
$0 [options]\n
--master/-m IP address of master server (Default: $dbhost_master)
--slave/-s IP address of slave server (Default: $dbhost_slave)
--ruser Replication username (Default: $replication_user)
--rpass Replication password (Default: $replication_pass)
--duser Database admin user (Default: $dbuser)
--dpass Database admin password (Default: XXXXXX)
--key/-k Full path to ssh-key (Default: $ssh_key)
--vol/-v Volume Group (Default: $volgroup)
--log/-l Logical Volume (Default: $logvol)
--help/-?/-h This help\n
";
exit 0;
}

# Define a statement handle
my $sth;
# Check to see if the snapshot volume already exists and die if it does
my %lvm = get_logical_volume_information($volgroup);
my @lvm = keys %lvm;
if (grep /\/$volgroup\/$snapshot_name$/, @lvm) {
die ("Snapshot volume already exists. Use lvmremove to remove it before running this command.\n");
}

# Shutdown Mysql on slave server
unless ( -e $ssh_key ) {
die ("Ssh key identity file missing: $ssh_key");
}
exe_cmd(
qq{ssh -i $ssh_key root\@$dbhost_slave "service mysql stop"},
qq{Couldn't ssh to slave and stop mysql}
);
print "Mysql on slave stopped\n";

#Connect to Master database
my $dbh = DBI->connect(
$dsn1,
$dbuser,
$dbpass,
{RaiseError => 0, AutoCommit => 1 }
) or die ("Error Connecting to server: ".DBI::errstr);
my $sth_dbs = $dbh->prepare(q{SHOW DATABASES});
$sth_dbs->execute or die("Problem executing query: ".$sth_dbs->errstr);

# Lock all tables on master database
$sth = $dbh->prepare(q{FLUSH TABLES WITH READ LOCK});
$sth->execute or die("Probelm executing query: ".$sth->errstr);
print "All Databases on master locked\n";

# Get the master log position and file
$sth = $dbh->prepare(q{SHOW MASTER STATUS});
unless ($sth->execute) {
my $err_msg = $sth->errstr;
my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
$sth_unlock->execute;
die("Probelm executing query: $err_msg\n".$sth_unlock->errstr);
}
my $master_status;
unless ($master_status = $sth->fetchrow_hashref) {
my $err_msg = $sth->errstr;
my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
$sth_unlock->execute;
die("Probelm executing query: $err_msg\n".$sth_unlock->errstr);
}

# Create the snapshot
my $lvcreate_msg = `lvcreate -L1G -s -n $snapshot_name \
/dev/$volgroup/$logvol 2>&1`
;
if ( $? ) {
my $sth_unlock = $dbh->prepare(q{UNLOCK TABLES});
$sth_unlock->execute;
die("Couldn't create snapshot: $lvcreate_msg $?\n".$sth_unlock->errstr);
}
print "Snapshot created\n";

# Unlock all tables
$sth = $dbh->prepare(q{UNLOCK TABLES});
$sth->execute or die("Probelm executing query: ".$sth->errstr);
print "All databases on master unlocked\n";

# Grant replication rights on master
$sth = $dbh->prepare(q{GRANT REPLICATION SLAVE ON *.* TO ?@? IDENTIFIED BY ?});
$sth->execute($replication_user,$dbhost_slave,$replication_pass)
or die("Problem executing query: ".$sth->errstr);

# Mount the snapshot
unless ( -d qq{/mnt/$snapshot_name}) {
mkdir qq{/mnt/$snapshot_name} or die ("Couldn't create mount point directory: $!");
}
my $mount_snapshot_msg = `mount /dev/$volgroup/$snapshot_name \
/mnt/$snapshot_name -onouuid,ro 2>&1`
;
die ("Couldn't mount snapshot: $mount_snapshot_msg $?") if $?;

# Start rsyncing the snapshot to the slave
print "Starting rsync\n";
my $out = q{};
while ( my ($db) = $sth_dbs->fetchrow_array ) {
unless (grep /^$db$/, @skip) {
$out .= `rsync -zrav /mnt/$snapshot_name/lib/mysql/$db/ \
$dbhost_slave:/var/lib/mysql/$db/ 2>&1`
;
die ("rsync failed: $out $?") if $?;
}
}

# Disconnect from Master Database
$dbh->disconnect;

# Get rid of snapshot
print $out;
exe_cmd(
qq{umount /mnt/$snapshot_name},
qq{Couldn't umount snapshot}
);
exe_cmd(
qq{lvremove -f /dev/$volgroup/$snapshot_name},
qq{Couldn't remove snapshot volume}
);
print "Snapshot removed\n";

# Start Mysql back up on slave (skip_slave_start better be in the my.cnf)
exe_cmd(
qq{ssh -i $ssh_key root\@$dbhost_slave "service mysql start"},
qq{Couldn't ssh to slave and start mysql}
);
print "Mysql on slave started\n";

# Connect to slave and setup replication
$dbh = DBI->connect( $dsn2, $dbuser, $dbpass, { RaiseError => 0, AutoCommit => 1 } )
or die ("Error Connecting to server: ".DBI::errstr);
my $change_master_query = q{
CHANGE MASTER TO MASTER_HOST = ?,
MASTER_USER = ?,
MASTER_PASSWORD = ?,
MASTER_LOG_FILE = ?,
MASTER_LOG_POS = ?
};
$sth = $dbh->prepare($change_master_query) or die ("Problem preparing query\n $change_master_query\n".DBI::errstr);
# Make MASTER_LOG_POS an INT
$sth->bind_param(5,1,SQL_INTEGER);
$sth->execute(
$dbhost_master,
$replication_user,
$replication_pass,
$master_status->{'File'},
int($master_status->{'Position'})
) or die ("Problem changing master:".$sth->errstr);

# Start slave
$sth = $dbh->prepare(q{START SLAVE});
$sth->execute or die ("Problem starting slave:".$sth->errstr);

# Check to see if everything worked
print "Sleep for 5 seconds...\n";
sleep 5; # this may need to be longer
$sth = $dbh->prepare(q{SHOW SLAVE STATUS});
$sth->execute;
my $slave_status = $sth->fetchall_arrayref({});
print "Slave_SQL_Running:\t".$slave_status->[0]->{'Slave_SQL_Running'}."\n";
print "Slave_IO_Running:\t".$slave_status->[0]->{'Slave_IO_Running'}."\n";
print "Slave_IO_State:\t".$slave_status->[0]->{'Slave_IO_State'}."\n";
print "Last_IO_Error:\t".$slave_status->[0]->{'Last_IO_Error'}."\n";
print "Seconds_Behind_Master:\t".$slave_status->[0]->{'Seconds_Behind_Master'}."\n";

sub exe_cmd {
my $cmd = shift;
my $fail_msg = shift;
my $msg = `$cmd 2>&1`;
chomp $msg;
die ("$fail_msg : $msg\n$?") if $?;
}
__END__
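
A typical run from the master, overriding the placeholder defaults (I'll call the script fix_replication.pl; name it whatever you like):

./fix_replication.pl --master 192.168.1.2 --slave 192.168.1.3 --duser USERNAME --dpass PASSWORD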

OSX Time Machine Over Your Network Without a Timecapsule

So I spent the other day figuring out how to do this, and then when I went to price out buying a NAS, Time Capsules were going for $200, so I bought one. I'm generally happy with it. Just as I was thinking I had wasted my time figuring out how to make Time Machine work with an unsupported network device, my co-worker mentioned to me today that he needed to get this working for our Macs at work. Ha! I said, I'll get to use this bit of knowledge after all. So here it is.

Start by connecting to your network storage through either SMB or AFP. Create a file called

.com.apple.timemachine.supported

Next you're going to need your machine name and the MAC address of your network card.

Drop to a terminal to get both of these. Your machine name should be your hostname, or the first part of your command prompt, the bit before the colon.

Laptop:~ greg$

Now execute
ifconfig

Locate the line that says
ether
under the network interface card. That's your MAC address.

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether 00:13:db:9d:62:fc


Now take that machine name and the MAC address, minus the colons, and create the name of the disk image you'll need to make this thing work. Put them together separated by an underscore.

Laptop_0013db9d62fc

Now you'll need to create a disk image. Use this command, replacing my image name with yours, and add .sparsebundle to the end of it.


Laptop:~ greg$ hdiutil create -layout SPUD -megabytes 204800 -fs HFS+J \
-type SPARSEBUNDLE -volname "Laptop_0013db9d62fc.sparsebundle" \
"Laptop_0013db9d62fc.sparsebundle"



You should also set the size of your disk image to about twice the size of the drive you are backing up. After the disk image is finished being created, drop it on your network storage device.

One last step and you're good to go. Execute this at the terminal.

defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1


Now connect to your network storage device and open up Time Machine. Hit Change Disk and your network storage device should show up as an available device.

Of course this is all subject to change. I read through a number of how-to postings on this written at different times, and it seems Apple keeps making this process more and more complicated, but as of the writing of this post, this process works.