With no existing Cassandra service script available, I decided to go ahead and create one, to make things a little easier to manage and the whole experience a little more user-friendly.
The script includes a few nodetool basics, such as repair, cleanup, info and netstats. It also logs the start and end times of repair and cleanup in its own log, so you can see how long the process takes without having to trawl through all the Cassandra logs to find a start and end time (very useful for us, as a repair takes over 5 hours to complete).
Here is the script. Simply copy the content into /etc/init.d/cassandra and make it executable:
#!/bin/bash
#
# Author: Brooke Bryan
#
#
# Description: Cassandra Server.
# Processname: cassandra
# Config: /usr/local/share/cassandra/conf/cassandra.yaml
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
. /etc/sysconfig/network
prog="Cassandra"
pidfile="/var/run/cassandra.pid"
progbin="/usr/local/share/cassandra/bin"
lock="/var/lock/subsys/cassandra"
logfile="/var/log/cassandra/service.log"
WriteLog()
{
echo "`date`: $@" >> $logfile
}
LogInfo()
{
echo "$@"
WriteLog "INFO: $@"
}
LogWarning()
{
echo "$@"
WriteLog "WARNING: $@"
}
start()
{
if [ -f $pidfile ] && checkpid `cat $pidfile`; then
action "$prog is already running." /bin/false
exit 0
fi
WriteLog "Starting $prog"
daemon "$progbin/cassandra" -p $pidfile >> $logfile 2>&1
# Capture the exit status of daemon before usleep overwrites $?
RETVAL=$?
usleep 500000
if [ $RETVAL -eq 0 ]; then
touch "$lock"
action "Starting $prog" /bin/true
else
action "Starting $prog" /bin/false
fi
WriteLog "Started $prog"
return $RETVAL
}
stop()
{
$progbin/nodetool -h localhost disablethrift
$progbin/nodetool -h localhost disablegossip
$progbin/nodetool -h localhost drain
WriteLog "Stopping $prog"
CASSIEPID=`cat "$pidfile" 2>/dev/null `
if [ -n "$CASSIEPID" ]; then
/bin/kill "$CASSIEPID" >/dev/null 2>&1
ret=$?
if [ $ret -eq 0 ]; then
STOPTIMEOUT=60
while [ $STOPTIMEOUT -gt 0 ]; do
/bin/kill -0 "$CASSIEPID" >/dev/null 2>&1 || break
sleep 1
let STOPTIMEOUT=${STOPTIMEOUT}-1
done
if [ $STOPTIMEOUT -eq 0 ]; then
echo "Timeout error occurred trying to stop $prog Daemon"
ret=1
action $"Stopping $prog: " /bin/false
LogInfo "Timeout error occurred trying to stop $prog Daemon pid($CASSIEPID)"
else
rm -f "$lock"
action $"Stopping $prog: " /bin/true
WriteLog "INFO: $prog Daemon Stopped pid($CASSIEPID)"
fi
else
action $"Stopping $prog: " /bin/false
WriteLog "WARNING: $prog Daemon Stop Failed pid($CASSIEPID)"
fi
else
ret=1
action $"Stopping $prog: " /bin/false
fi
return $ret
}
restart()
{
LogInfo "Restart Initiated"
stop
start
}
ring()
{
$progbin/nodetool -h localhost ring
}
info()
{
$progbin/nodetool -h localhost info
}
netstats()
{
$progbin/nodetool -h localhost netstats
}
repair()
{
LogInfo "Starting Repair"
$progbin/nodetool -h localhost repair
LogInfo "Completed Repair"
}
cleanup()
{
LogInfo "Starting Cleanup"
$progbin/nodetool -h localhost cleanup
LogInfo "Completed Cleanup"
}
version()
{
$progbin/nodetool -h localhost version
}
# See how we were called.
case "$1" in
start)
start
;;
stop)
stop
;;
status)
status cassandra
;;
restart)
restart
;;
ring)
ring
;;
info)
info
;;
netstats)
netstats
;;
repair)
repair
;;
cleanup)
cleanup
;;
version)
version
;;
*)
echo $"Usage: $0 {start|stop|status|restart|ring|info|netstats|repair|cleanup|version}"
exit 1
esac
exit $?
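The repair and cleanup commands above only log a start line and an end line, so the duration still has to be worked out by hand. As a rough sketch (not part of the script above), a wrapper along these lines could log the elapsed time as well. The function name run_timed and the LOGFILE variable are hypothetical, and the log path defaults to /tmp so the example can be tried without root.

```shell
#!/bin/bash
# Hypothetical helper, not part of the init script above.
# LOGFILE defaults to a temp path so the sketch can be run without root.
LOGFILE="${LOGFILE:-/tmp/cassandra-service-example.log}"

# run_timed LABEL COMMAND [ARGS...]
# Runs the command, then logs how long it took and its exit status.
run_timed()
{
    local label="$1"; shift
    local started finished elapsed rc
    started=$(date +%s)
    echo "$(date): INFO: Starting $label" >> "$LOGFILE"
    "$@"
    rc=$?
    finished=$(date +%s)
    elapsed=$(( finished - started ))
    printf '%s: INFO: Completed %s in %02dh %02dm %02ds (exit %d)\n' \
        "$(date)" "$label" \
        $(( elapsed / 3600 )) $(( elapsed % 3600 / 60 )) $(( elapsed % 60 )) \
        "$rc" >> "$LOGFILE"
    return $rc
}

# Example:
#   run_timed "Repair" /usr/local/share/cassandra/bin/nodetool -h localhost repair
```

The wrapper returns the exit status of the wrapped command, so it could be dropped into a function like repair() without changing how failures are reported.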
Just a quick post on how to install Cassandra on CentOS 5, including the extra bits needed to stop the errors you would otherwise see, such as JNA and MX4J missing.
First you need to get all the required packages from yum to prepare the server.
yum -y install gcc-c++ make cmake python-devel bzip2-devel zlib-devel
yum -y install log4cpp-devel git git-core cronolog google-perftools-devel
yum -y install readline-devel ncurses-devel libtool autoconf expat
yum -y install libevent-devel flex byacc expat-devel
# Perl Modules for Thrift Install
yum -y install perl-Bit-Vector perl-Class-Accessor
# Java Install
yum -y remove jpackage-utils
wget http://dev.centos.org/centos/5/testing/x86_64/RPMS/jpackage-utils-1.7.5-1jpp.1.el5.centos.noarch.rpm
rpm -ivh jpackage-utils-1.7.5-1jpp.1.el5.centos.noarch.rpm
yum -y install xml-commons-apis xml-commons-apis-javadoc ant
yum -y install java
yum -y install log4j jakarta-commons-logging jakarta-commons-lang
yum -y install java-1.4.2-gcj-compat java-1.4.2-gcj-compat-devel
Next you will want to download the latest version of Cassandra, available at http://cassandra.apache.org/download/
I have chosen to install Cassandra in the following location: /usr/local/share/cassandra
wget http://mirror.ox.ac.uk/sites/rsync.apache.org//cassandra/0.8.4/apache-cassandra-0.8.4-bin.tar.gz
tar -xvf apache-cassandra-0.8.4-bin.tar.gz
mkdir -p /usr/local/share/cassandra/
cp -Rf ~/apache-cassandra-0.8.4/* /usr/local/share/cassandra
cd /usr/local/share/cassandra
Installing JNA is done as follows:
wget "https://github.com/twall/jna/raw/3.3.0/jnalib/dist/jna.jar" --no-check-certificate -O /usr/local/share/cassandra/lib/jna.jar
chmod 755 /usr/local/share/cassandra/lib/jna.jar
MX4J is installed with:
wget "http://downloads.sourceforge.net/project/mx4j/MX4J%20Binary/3.0.2/mx4j-3.0.2.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fmx4j%2Ffiles%2FMX4J%2520Binary%2F3.0.2%2F&ts=1314263784&use_mirror=freefr" -O mx4j-3.0.2.tar.gz
tar zxvf mx4j-3.0.2.tar.gz mx4j-3.0.2/lib/mx4j-tools.jar
cp mx4j-3.0.2/lib/mx4j-tools.jar /usr/local/share/cassandra/lib/
chmod 755 /usr/local/share/cassandra/lib/mx4j-tools.jar
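Before starting Cassandra it can be worth confirming that both jars actually landed in the lib directory. A minimal sketch, assuming the install path used in this post; the function name missing_jars is made up for illustration:

```shell
#!/bin/bash
# Print the name of each optional jar that is missing (or unreadable) in the
# given Cassandra lib directory. Returns non-zero if anything is missing.
missing_jars()
{
    local libdir="$1" jar rc=0
    for jar in jna.jar mx4j-tools.jar; do
        if [ ! -r "$libdir/$jar" ]; then
            echo "$jar"
            rc=1
        fi
    done
    return $rc
}

# Example:
#   missing_jars /usr/local/share/cassandra/lib || echo "some jars are missing"
```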
Switching to Sun Java
You will need to download the latest JDK from Sun. You can then switch from OpenJDK with the following:
rpm -ivh /root/jdk-7-linux-x64.rpm
/usr/sbin/alternatives --install /usr/bin/java java /usr/java/jdk1.7.0/bin/java 2 && /usr/sbin/alternatives --config java
Just enter the number of the new JDK at the selection prompt and hit enter (on a fresh install, it's usually #3).
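To double check which JVM ends up on the path after the switch, the output of java -version can be inspected. The helper below is only a sketch: the name jvm_kind is hypothetical, and the matching relies on my assumption that OpenJDK builds mention "OpenJDK", Sun builds mention "HotSpot" and the gcj compatibility package mentions "gcj" in their version banners.

```shell
#!/bin/bash
# Classify `java -version` output read from stdin.
# The substrings matched here are assumptions about typical version banners.
jvm_kind()
{
    local text
    text=$(cat)
    case "$text" in
        *OpenJDK*)    echo "openjdk" ;;
        *HotSpot*)    echo "sun" ;;
        *gcj*|*GCJ*)  echo "gcj" ;;
        *)            echo "unknown" ;;
    esac
}

# Example (java -version writes to stderr, hence the redirect):
#   java -version 2>&1 | jvm_kind
```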
Finishing up.
Once everything above is done, you should be able to edit the config file /usr/local/share/cassandra/conf/cassandra.yaml and then start Cassandra by running /usr/local/share/cassandra/bin/cassandra
If you are running a cluster setup with Cassandra, you can use a token calculator such as http://blog.milford.io/cassandra-token-calculator/ to work out tokens that spread the data evenly across your nodes.
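For reference, the arithmetic behind such a calculator is simple. The sketch below assumes the RandomPartitioner, whose token space runs from 0 to 2^127, and spaces node i at i * (2^127 / N); the resulting values would go into initial_token in cassandra.yaml. Because 2^127 does not fit in bash's 64-bit integers, the sum is done digit by digit on decimal strings. All function names here are made up for illustration, and a tool such as bc or the linked calculator would do the same job with less code.

```shell
#!/bin/bash
# Evenly spaced tokens for an N-node ring, assuming RandomPartitioner
# (token space 0 .. 2^127): token(i) = i * (2^127 / N) for i = 0 .. N-1.
RING_SIZE=170141183460469231731687303715884105728   # 2^127

# Remove leading zeros from a decimal string ("" becomes "0").
strip_zeros() { local s="$1"; s="${s#"${s%%[!0]*}"}"; echo "${s:-0}"; }

# Decimal string divided by a small positive integer, rounded down.
div_small()
{
    local num="$1" d="$2" rem=0 out="" i cur
    for (( i = 0; i < ${#num}; i++ )); do
        cur=$(( rem * 10 + ${num:i:1} ))
        out+=$(( cur / d ))
        rem=$(( cur % d ))
    done
    strip_zeros "$out"
}

# Decimal string multiplied by a small non-negative integer.
mul_small()
{
    local num="$1" m="$2" carry=0 out="" i cur
    for (( i = ${#num} - 1; i >= 0; i-- )); do
        cur=$(( ${num:i:1} * m + carry ))
        out="$(( cur % 10 ))$out"
        carry=$(( cur / 10 ))
    done
    strip_zeros "$carry$out"
}

# Print one token per node for a ring of the given size.
tokens_for()
{
    local nodes="$1" step i
    step=$(div_small "$RING_SIZE" "$nodes")
    for (( i = 0; i < nodes; i++ )); do
        echo "node $i: $(mul_small "$step" "$i")"
    done
}

# Example:
#   tokens_for 4
```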
Also be sure to put your commit log and data directory on different disks.
The best option I have found is to keep the commit log and OS on an SSD, and to store the data on a RAID 0 array of SAS drives. Make sure the data disk has at least double the space that your node's data will consume.
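Both of these points can be sanity checked from the shell. This is a minimal sketch with hypothetical function names; the directories in the example are placeholders for whatever commitlog_directory and data_file_directories are set to in cassandra.yaml. Note that it compares filesystems as reported by df, so two partitions on the same physical disk would still count as "different".

```shell
#!/bin/bash
# Filesystem (device) backing a path, taken from POSIX-format df output.
device_of() { df -P "$1" | awk 'NR == 2 { print $1 }'; }

# Succeeds when the two directories live on different filesystems.
on_different_disks() { [ "$(device_of "$1")" != "$(device_of "$2")" ]; }

# Succeeds when the total size is at least twice the data size.
# Both arguments are in kilobytes.
has_double_capacity()
{
    local total_kb="$1" data_kb="$2"
    [ "$total_kb" -ge $(( data_kb * 2 )) ]
}

# Example, using placeholder directories:
#   on_different_disks /var/lib/cassandra/commitlog /var/lib/cassandra/data \
#       || echo "commit log and data share a filesystem"
#   total_kb=$(df -Pk /var/lib/cassandra/data | awk 'NR == 2 { print $2 }')
#   data_kb=$(du -sk /var/lib/cassandra/data | awk '{ print $1 }')
#   has_double_capacity "$total_kb" "$data_kb" || echo "less than double the space"
```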











