11.11.15

Windows 7 will not connect to Mac OS 10.11 El Capitan

1. Verify that you can ping the El Capitan server from the Windows 7 machine.
2. Port-scan the El Capitan server to make sure port 445 (SMB) is responding.

All good?  Ok, here's where it gets ugly.

3. Open the command window, CMD.EXE, on Windows. Type 'regedit'.
4. Expand folders:
     HKEY_LOCAL_MACHINE
          SYSTEM
               CurrentControlSet
                    Control
5. Select the folder 'Lsa' within 'Control'. A list of values should appear to the right in the Registry Editor application.
6. Select the value called "lmcompatibilitylevel". Choose "Modify" from the Edit menu.
7. Set the "Value data:" field to "3". Click "OK".
8. Quit Registry Editor.
9. Try connecting again.

22.9.15

Calculate variance and standard deviation using Welford's algorithm and AWK

boo="1 2 3 4 5 6 0.257143 0.278571 0.285714 0.271429 0.235714";
awk '{
  M = 0;   # running mean
  S = 0;   # running sum of squared deviations from the mean
  for (k=1; k <= NF; k++) {
    x = $k;
    oldM = M;
    M = M + ((x - M)/k);
    S = S + (x - M)*(x - oldM);
  }
  var = S/(NF - 1);   # sample variance, needs at least two values
  print "n=" NF " mean=" M " var=" var " sd=" sqrt(var);
}' <<< "$boo"


n=11 mean=2.02987 var=4.60305 sd=2.14547
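
A quick way to check the Welford result is the textbook two-pass formula: compute the mean first, then sum the squared deviations from it. The sketch below wraps that in a shell function; the name "two_pass_var" is made up for this example and is not part of any tool.

```shell
# Two-pass variance as a cross-check (hypothetical helper, not Welford's method)
two_pass_var() {
  echo "$1" | awk '{
    sum = 0;
    for (k = 1; k <= NF; k++) sum += $k;     # pass 1: mean
    mean = sum / NF;
    ss = 0;
    for (k = 1; k <= NF; k++) ss += ($k - mean) * ($k - mean);   # pass 2: squared deviations
    var = ss / (NF - 1);                     # sample variance
    print "n=" NF " mean=" mean " var=" var " sd=" sqrt(var);
  }'
}

two_pass_var "1 2 3 4 5 6 0.257143 0.278571 0.285714 0.271429 0.235714"
```

For well-behaved data like this the two methods should agree to the printed precision. Welford's version needs only one pass and does not have to keep the whole series around.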



20.8.15

GNU Parallel and Rocks clusters I. Distributing BASH scripts across nodes

1. Install GNU Parallel as user:
(wget -O - pi.dk/3 || curl pi.dk/3/) | bash

Path to executable should now be:  ~/bin/parallel
Modify PATH if necessary so that the GNU Parallel you just installed is found before any other GNU Parallel on the system. I did this by putting my ~/bin directory ahead of the other entries in PATH in my ~/.bash_profile file, like so:

# User specific environment and startup programs
PATH=$HOME/bin:$PATH
export PATH


2. Test:
rocks run host command="~/bin/parallel ::: hostname"

The output should look something like:
compute-0-0: down
compute-0-1.local
compute-0-2.local
compute-0-5.local
compute-0-4.local

Here each compute node should receive a request (via "rocks run host") to print its name to standard output ("hostname") which is executed via GNU Parallel.  My cluster happens to be missing compute-0-3, and has a permanently dead node called compute-0-0, which explains the weirdness in the output above.


3. A more complex test.  Here we want to parallelize a simple subroutine across all nodes in the cluster.  We need a file that names all nodes we want to send jobs to.  Let's call it ~/machines. It should look something like:

cat ~/machines
compute-0-1
compute-0-2
compute-0-4
compute-0-5

In the BASH script below, the simple subroutine ("subr()") creates a couple of variables ("a" and "b") from the arguments sent to it by GNU Parallel.  It then echoes those variables plus the hostname of the node running it.

subr() needs to be exported to the shell so that GNU parallel can access it ("export -f subr") from the various independent shells it creates.

The "parallel" command is assembled as follows:
parallel #call to the executable, you may need to use the full path ~/bin/parallel
--env subr #pass the exported subroutine to the new shell
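--sshloginfile machines #path to the machines file with the list of nodes (here ~/machines, so run this from your home directory or give the full path); note this assumes passwordless ssh, the norm on Rocks clusters.
--jobs 24 #maximum number of simultaneous jobs per node, here the number of cores available on the compute node
subr #a call to the subroutine
::: $a ::: $b #each ::: introduces a list of arguments; GNU Parallel runs subr once for every combination of the two lists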


4. Paste the following into the terminal:

subr() {
a=$1;
b=$2;
echo -n $a $b" ";
hostname;
}
export -f subr; #necessary for gnu parallel to work

a="1 2 3";
b="x y z";
parallel --env subr --sshloginfile machines --jobs 24 subr ::: $a ::: $b;

The output should look something like:
1 x compute-0-1.local
1 y compute-0-5.local
1 z compute-0-4.local
2 y compute-0-1.local
2 x compute-0-2.local
2 z compute-0-5.local
3 z compute-0-1.local
3 x compute-0-4.local
3 y compute-0-2.local

So each pairwise combination of variables has been "echoed" precisely one time, using the nodes specified in the "machines" file, as controlled by GNU Parallel. Go Ole!
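
To see which argument pairs "::: $a ::: $b" generates without touching the cluster, a plain nested loop gives the same combinations, just serially and on one machine. This is only an illustration of the pairing, not a replacement for GNU Parallel, and "pair_args" is a name invented for this sketch.

```shell
# Serial stand-in for: parallel subr ::: $a ::: $b (hypothetical helper)
pair_args() {
  for i in $1; do        # unquoted on purpose: split the list on whitespace
    for j in $2; do
      echo "$i $j"       # one line per combination, like the first two fields above
    done
  done
}

a="1 2 3";
b="x y z";
pair_args "$a" "$b"
```

The loop prints the nine pairs in a fixed order; with GNU Parallel the order depends on which node finishes first.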

21.7.15

Enable client-side smart card authenticated ssh on Mac OS 10.6, 10.7, and 10.10

Problem: The ssh command on some Macs does not support smartcard authenticated ssh.  The following error appears when using the -I option:
"no support for PKCS#11."

Solution: Install a 'portable' version of openssh.  This is a very brief, slightly modified rendition of the excellent tutorial at http://www.gooze.eu/howto/using-openssh-with-smartcards.  Every system seems to be a little bit different, but this worked for me. Thrice. Skip steps 3 and 4 for Yosemite (10.10).

1. Install tools to read cards:
brew install opensc

2. Test card reader:
pkcs15-tool --list-public-keys

3. Download openssh5.5p1 (maybe you want a newer one, this is what I tested):
curl http://ftp3.usa.openbsd.org/pub/OpenBSD/OpenSSH/portable/openssh-5.5p1.tar.gz > openssh-5.5p1.tar.gz
tar -xzvf openssh-5.5p1.tar.gz

4. Install:
cd openssh-5.5p1
./configure --prefix=/usr/local/bin/openssl --without-openssl-header-check
make
make install

5. Test (you may have to change the path to the brew-installed opensc libraries):
/usr/local/bin/openssl/bin/ssh -I /usr/local/Cellar/opensc/0.14.0/lib/opensc-pkcs11.so login@xxx.xxx.xxx.xxx

6. Modify $PATH, placing /usr/local/bin/openssl/bin before /usr/bin (or wherever your 'old' ssh is)

7. Open new terminal window, verify that you are now using the 'new' ssh:
which ssh

8. To avoid having to type the -I option, and automatically use the card reader, add this line:
PKCS11Provider /usr/local/Cellar/opensc/0.14.0/lib/opensc-pkcs11.so
to:
/usr/local/bin/openssl/etc/ssh_config

9. Connect: 
ssh login@xxx.xxx.xxx.xxx

You can still use your 'old' ssh by including the full path name /usr/bin/ssh.




24.6.15

Install 'screen' from CentOS base repository on Rocks 5.5 cluster using 'yum'

Problem:
On my install of Rocks 5.5 running CentOS 5.8, there was no working 'screen' command.  You need 'screen'. Really. Following the few threads out there I became mired in the exasperating process of adding a yum repository to Rocks.  Don't be like me, don't follow the instructions at:
http://central6.rocksclusters.org/roll-documentation/base/6.1/update.html
This particular problem is much easier to solve.  The repo is already there, just not enabled.

Solution:
1. Determine which repositories, already installed on your cluster, are enabled:

yum repolist all

For me, the only enabled repository was Rocks-5.5. 'Screen' is not found there, so when you run:

yum install screen

you get:

Setting up Install Process
No package screen available.
Nothing to do


2. Enable the CentOS base repository:

cd /etc/yum.repos.d
cat CentOS-Base.repo

note that:
...
[base]
enabled = 0
...

change that:

cp CentOS-Base.repo CentOS-Base.repo.old          #preserve the initial file
perl -0pe 's/\[base\]\nenabled = 0/\[base\]\nenabled = 1/' CentOS-Base.repo > CentOS-Base.repo.new

cat CentOS-Base.repo.new

note that now:
...
[base]
enabled = 1
...

rename new file to original file name:

mv CentOS-Base.repo.new CentOS-Base.repo

3. Verify that CentOS base repository is now enabled:

yum repolist all

4. Proceed with normal installation process using yum:

yum install screen

...hooray, screen.

21.5.15

Make lower triangle matrix from string using bash

This is sometimes useful for formatting a genetic distance matrix from input data of unspecified format. Yes, it is an ugly kludge.

val="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15";
j=1; k=0; v=$(echo $val | wc -w | awk '{print $1}');
r=$(printf "%.0f" $(echo "sqrt(2*$v+(1/4))-(1/2)" | bc -l));
for ((i=1;i<=$r;i++));
do m=$(($j+$k));
j=$(($j+$k)); k=$(($k+1));
n=$(printf "%.0f" $(echo "($k+1)*($k/2)" | bc -l));
echo $val | cut -d' ' -f$m-$n;
done


1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
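
The same layout can also be produced in a single awk pass, without bc or the square-root step, by starting a new row whenever the current row holds as many values as its row number. This is a sketch of an alternative, not the method above, and "lower_triangle" is a name invented here.

```shell
# Lower triangle in one awk pass (hypothetical helper): row i receives i values
lower_triangle() {
  echo "$1" | awk '{
    row = 1; col = 0; line = "";
    for (k = 1; k <= NF; k++) {
      line = (col == 0) ? $k : line " " $k;
      col++;
      if (col == row) { print line; row++; col = 0; line = "" }
    }
    if (col > 0) print line;   # leftover values when the count is not triangular
  }'
}

lower_triangle "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15"
```

Unlike the loop above, this version also prints a short final row if the number of values is not a triangular number, which makes malformed input easier to spot.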

29.1.15

Calculate generalized variance using R, launched from bash or Applescript

Problem #1: How to calculate the generalized variance.  This seldom used statistic of dispersion expresses the variance in a multivariate data set.  It is calculated as the determinant of the variance-covariance matrix.

One solution: Use R.  Here is a script:

#!/usr/bin/Rscript
C<-matrix(c(49.4112,16.2815,46.4447,4.11667,54.4,-3),3,2,byrow=T)
det(cov(C))

The matrix holds three latitude/longitude pairs, one per row (to help you visualize the data input structure).  To execute from a bash shell, make the file executable (chmod u+x) and enter the file path.
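
For the two-variable case the determinant is simple enough to write out by hand: var(x)*var(y) - cov(x,y)^2. The awk sketch below does exactly that, as a rough cross-check on the R result when R is not installed. It assumes the values arrive as one line of x y pairs, and the name "gen_var_2d" is invented for the example.

```shell
# Generalized variance for two variables (hypothetical helper, 2-D case only)
gen_var_2d() {
  # $1: whitespace-separated values read as x1 y1 x2 y2 ...
  echo "$1" | awk '{
    n = NF / 2;
    for (i = 1; i <= n; i++) { x[i] = $(2*i - 1); y[i] = $(2*i); mx += x[i]; my += y[i] }
    mx /= n; my /= n;
    for (i = 1; i <= n; i++) {
      sxx += (x[i] - mx) * (x[i] - mx);
      syy += (y[i] - my) * (y[i] - my);
      sxy += (x[i] - mx) * (y[i] - my);
    }
    # n - 1 denominators, matching the sample covariance that cov() in R computes
    vx = sxx / (n - 1); vy = syy / (n - 1); cxy = sxy / (n - 1);
    print vx * vy - cxy * cxy;   # determinant of the 2x2 variance-covariance matrix
  }'
}

gen_var_2d "49.4112 16.2815 46.4447 4.11667 54.4 -3"
```

With more than two variables the determinant no longer has a one-line formula, which is a good reason to stay with R.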


Problem #2: How to launch from Applescript.

One solution:

--1. generate R script file
set rscript to (open for access ("rscript.r") with write permission)
set eof rscript to 0

write ("#!/usr/bin/Rscript
C<-matrix(c(49.4112,16.2815,46.4447,4.11667,54.4,-3),3,2,byrow=T)
cat(det(cov(C)))
")  to rscript
close access rscript

--2. execute R script, retrieve generalized variance
do shell script ("chmod u+x /rscript.r")
set genvar to (do shell script ("/rscript.r"))
display dialog genvar

Just two very minor tricks here. First, enclose det(cov(C)) in a cat command. This will eliminate any extraneous output from R. Second, Applescript creates files with limited permissions, so you have to chmod to allow execute.