sublimated

Tag: random

Shuffled standard input or shuffled files

Every now and then I like to shuffle a file or a directory. Here is a trivial shuffle script in Ruby:

#!/usr/bin/ruby

# output randomly shuffled lines from the file passed as an argument
# or input fed to standard input
# ex: ls | shuffle.rb 
# This produces a randomly shuffled directory

# a modified version of the code from http://bit.ly/dUdAb9
# shuffle modified from "Programming in Ruby" by Thomas and Hunt

# get the lines:
if ARGV.size == 1
  lines = IO.readlines(ARGV[0])
elsif not ARGF.eof?
  lines = ARGF.readlines
else
  abort "usage: shuffle.rb <file>"
end

# pick a random line, remove it, and print it
lines.size.times do
  print lines.delete_at(rand(lines.size))
end
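
For comparison, here is a minimal sketch of the same idea using only standard shell tools. It is not the Ruby method above: it tags each line with a random number, sorts on that tag, and strips the tag again. The function name shuffle_lines is made up for illustration, and since awk's srand() is seeded from the clock, two runs within the same second may produce the same order.

```sh
#!/bin/sh

# Shuffle lines read from standard input: decorate each line with a
# random key, sort on the key, then remove the key again.
# (shuffle_lines is a hypothetical name, not part of any package.)
shuffle_lines() {
    awk 'BEGIN { srand() } { printf "%.17f\t%s\n", rand(), $0 }' |
        sort -k1,1n |
        cut -f2-
}

# Usage:
#   ls | shuffle_lines
```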

Online Textbooks: Downloading and Merging Multiple Chapters

Sometimes clueful authors provide PDF copies of their texts for free. Oft-times these same folks link a separate PDF for each chapter, which is convenient for browsing but annoying if you want to copy the whole text for off-line browsing and reference.

Case in point, Cain has a great list of online math texts. He has also co-authored with Herod a nice Multivariable Calculus text.

An elementary programming exercise: download each chapter as quickly as possible and assemble them into a single PDF. My own answer:


# grab the PDFs linked from the book's webpage
wget -r --no-directories --no-parent -A.pdf http://people.math.gatech.edu/~cain/notes/calculus.html
# join the PDFs
pdftk *.pdf cat output multivariable-calculus.pdf
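
One caveat with *.pdf is that the shell expands it in plain lexical order, so a file named ch10.pdf would be merged before ch2.pdf. A small sketch of a workaround follows, assuming GNU sort (for its -V option) and file names without whitespace. The names natural_order and ch1.pdf are hypothetical; I have not checked how the chapters are actually named.

```sh
#!/bin/sh

# Print the names given on stdin in natural (version) order, so that
# ch2.pdf sorts before ch10.pdf. Relies on the -V option of GNU sort.
# (natural_order is a hypothetical helper name.)
natural_order() {
    sort -V
}

# Usage, assuming file names without spaces:
#   pdftk $(ls *.pdf | natural_order) cat output multivariable-calculus.pdf
```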

How many physical cores on Linux

Suppose you want to write a script which parallelizes building a large program. You could just assume that everyone has a dual-core machine, but that would oversubscribe a single-core machine and under-utilize a six-core one. A smarter method is to set N in make’s -jN in proportion to the number of physical cores in your computer. Over at How-To Geek there is a script which gets close. However, when you have a hyperthreaded processor, their technique will over-report the number of actual cores. Here is a slightly more complex script which will extract the number of cores from cpuinfo:

grep "^cpu cores" /proc/cpuinfo | head -n 1 | cut -d ' ' -f 3

If you execute this on a Core i7 the script finds the actual number of physical cores:

$ grep "^cpu cores" /proc/cpuinfo | head -n 1 | cut -d ' ' -f 3
4
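
The "cpu cores" field is reported per physical package, so on a multi-socket machine the one-liner above only describes one socket. A possible refinement, sketched under the assumption that cpuinfo carries the usual x86 "physical id" and "core id" fields, is to count the distinct (physical id, core id) pairs. The function name physical_cores is hypothetical, and on systems without those fields it simply prints 0.

```sh
#!/bin/sh

# Count distinct (physical id, core id) pairs in a cpuinfo-style file.
# The optional first argument is the file to read; it defaults to
# /proc/cpuinfo. (physical_cores is a hypothetical helper name.)
physical_cores() {
    awk -F': *' '
        /^physical id/ { pkg = $2 }               # remember current socket
        /^core id/     { seen[pkg "," $2] = 1 }   # record socket,core pair
        END { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}"
}

# Usage:
#   make -j"$(physical_cores)"
```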

Getting some commas into those spaces

Here is a troglodytic programming task I have brushed past so often that I feel there ought to be a canned script to solve it. The problem is: you have a file with text separated by arbitrary whitespace and you want to convert it to a comma-separated value (CSV) file. In short, you need a space-to-CSV converter. By way of example, suppose you have:

field1 field2    field3
field4 field5 field6

But you really need to feed R, MATLAB, or Excel:

field1,field2,field3
field4,field5,field6

You have ten seconds, tell me in which language you’d solve that. My answer is “no language” … otherwise known as a BASH script wrapper for tr:

#!/bin/sh

# Converts the text file with fields separated by whitespace named in
# the first argument into a CSV file for output named by the second
# argument.

# First test that two arguments were passed

ARGS=2        # Number of arguments expected.
E_BADARGS=65  # Exit value if incorrect number of args passed.

test $# -ne $ARGS && \
echo "Usage: `basename $0` $ARGS argument(s)" && \
echo "Example: `basename $0` inputfile outputfile" && \
exit $E_BADARGS

# Next convert whitespace to commas for CSV

tr -s '[:blank:]' ',' < "$1" > "$2"

Please download space-to-csv.sh, a small tool for a small problem.
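
One thing the tr approach does not handle is leading or trailing whitespace on a line, which comes out as a leading or trailing comma. A minimal alternative sketch with awk is below. It is a different technique from the script above, the name space_to_csv is hypothetical, and it does no quoting of fields that already contain commas.

```sh
#!/bin/sh

# Join whitespace-separated fields with commas. Assigning $1 to itself
# makes awk rebuild the record using OFS, which also discards leading
# and trailing blanks. Reads the named files, or stdin if none given.
# (space_to_csv is a hypothetical helper name.)
space_to_csv() {
    awk -v OFS=',' '{ $1 = $1; print }' "$@"
}

# Usage:
#   space_to_csv inputfile > outputfile
```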

Player/Stage on OS X 10.5 with Macports

Below is a quick recipe for installing and testing Player/Stage 2.0.4 on OS X 10.5 Leopard using Macports. First, if you have not already, install Macports. Once Macports is installed, from the terminal execute:

sudo port install playerstage-stage playerstage-player

At this point, I tried to launch Player/Stage, but received errors about rgb.txt:

unable to open color database /usr/X11R6/lib/X11/rgb.txt :
No such file or directory (stage.c stg_lookup_color)

To correct this problem, player/stage needs a link to the X11 color map in the place it expects:

sudo ln -s /usr/X11/share/X11/rgb.txt /usr/X11R6/lib/X11/rgb.txt

Now you are ready to launch and test player/stage:

player /opt/local/var/macports/software/playerstage-stage\
/*/opt/local/share/stage/worlds/simple.cfg

If you are successful, the Stage simulation window should appear. You can drive player/stage using an example program in another terminal:

/opt/local/var/macports/software/playerstage-player\
/*/opt/local/share/player/examples/libplayerc++/laserobstacleavoid

After playing around a bit, I am sure you will agree that player/stage is the best thing going right now for open source robot simulation, control, and development.

Keyboard quirks on the Japanese Macbook Pro

Xmodmap for Japanese Macbook Pro Keyboards
So, you are in the unlikely position of using a Japanese model Macbook Pro with X11 and you notice that the keyboard is all wacky. It seems like they shipped the English modmap, which is pretty inconvenient since it doesn’t map the keys correctly.

Over on this blog I found a modmap which partly alleviates the problem. However, I still had some problems on my notebook, in particular the shift keys don’t work correctly. I decided this was because the file maps a few too many keys to nothing at all. To remedy this, I used sed to remove the lines ending with equal signs:

sed '/=$/d' Xmodmap.broken > .Xmodmap

The resulting .Xmodmap, placed in my home directory, works well. This is on OS X Leopard, with a late 2006 model Macbook Pro.
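
To see what the sed filter does, here is a small sketch that wraps it in a function and can be tried on sample input. It assumes a slightly looser pattern than the one above, so that lines ending in "=" followed by stray spaces are dropped too. The function name strip_empty_mappings and the keycodes in the usage comment are invented for illustration.

```sh
#!/bin/sh

# Delete modmap lines that assign nothing to a key, that is, lines
# ending in "=" optionally followed by whitespace. Reads the named
# files, or stdin if none are given.
# (strip_empty_mappings is a hypothetical helper name.)
strip_empty_mappings() {
    sed '/=[[:space:]]*$/d' "$@"
}

# Usage:
#   strip_empty_mappings Xmodmap.broken > .Xmodmap
#
# Sample input lines such as
#   keycode 10 =
#   keycode 11 = a A
# would keep only the second line.
```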

Emacs, ¥, 円, and Backslash
Another annoyance is that Emacs (in particular Aquamacs) seems to always insert Yen symbols. The terminal is set up such that a press of the ¥ key causes a \ to be input. This is convenient in that backslash is used frequently in many Unix-style programs such as latex and bash.

After some experimenting, I found that if you add the following line to your .emacs file, the ¥ key will emit \ just like the terminal:

(global-set-key [2213] "\\")

This might save a lot of headaches if you enter backslash more frequently than yen. And of course the symbol most commonly used in Japan for Yen is 円.

Not enough random bytes available

As a periodic user of gpg (by way of emacs’s crypt++) I sometimes encounter the somewhat entertaining “Not enough random bytes available” message.

Suppose I am wanting a new key and thusly fire up gpg:

gpg --gen-key

After entering the usual information I come face to face with:

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++.+++++++++++++++++++++++++++++++++++++++++++++++++++++++.++++++++++++
+++.+++++..+++++.++++++++++++++++++++..+++++++++++++++..+++++.++++++++++>+
++++...+++++

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 283 more bytes)

On first encounter with this dialog, I thought: “they are joking right?” But it turns out the key generator is not joking at all.

Depending on how many bits you chose for your key size, you might be waiting for quite some time (perhaps hours, maybe days). So I found myself asking (1) how can I gain entropy, and (2) how much entropy is available?

On Linux at least, gpg uses /dev/random as a source of high quality random bits. Word on the street is that random gets its high quality bits by transforming interrupt events.

So on the first note (how to gain entropy) it seems like you can generate some interrupts by using the keyboard, disk, or network. A good way to do this is to use your computer (downloading and compiling are really good activities). Alternatively, you can install a user space entropy gathering system like EGD. Or, if you are really a key-generating addict, you can get some special hardware.

On the second note (how much entropy are you gaining) on Linux you can watch the entropy pool by observing the appropriate spot in the /proc filesystem:

watch cat /proc/sys/kernel/random/entropy_avail

When that number goes up, you are doing the right thing. Eventually, you will get some more ascii noise indicating that gpg is making progress. Finally you should see something similar to:

+++++...++++++++++..++++++++++++++++++++..++++++++++.....+++++..++++++++++
++++++++++++++++++++++++++++++.+++++.++++++++++++++++++++.+++++..+++++++++
+++++++++++.+++++...+++++.+++++++++++++++>+++++.+++++++++++++++++++++++++.
++++++++++.++++++++++>+++++...............>+++++.............+
++++.+++++..................+++++^^^^^^^^^

gpg: key 2BC5527E marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   3  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 3u
gpg: next trustdb check due at 2008-08-27
pub   1024D/2BC5527E 2007-08-28 [expires: 2008-08-27]
      Key fingerprint = 1445 DE3C 3F54 CD3E BB48  3B1C 516D F284 2BC5 527E
uid                  Carson Reynolds 
sub   4096g/B4E02D04 2007-08-28 [expires: 2008-08-27]

Anyway, I hope if you encounter the “Not enough random bytes available.” message, you can use this post to figure out what it’s about and how to reduce your waiting time.
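
If you would rather not stare at watch, the entropy check described above could be scripted roughly as follows. This is only a sketch: the function name wait_for_entropy and the threshold in the usage comment are hypothetical, and the number it compares against is whatever entropy_avail reports on your kernel.

```sh
#!/bin/sh

# Block until the value in the entropy_avail file reaches the requested
# threshold, polling once per second. The optional second argument is
# the file to read; it defaults to the Linux /proc location.
# (wait_for_entropy is a hypothetical helper name.)
wait_for_entropy() {
    want=$1
    src=${2:-/proc/sys/kernel/random/entropy_avail}
    while [ "$(cat "$src")" -lt "$want" ]; do
        sleep 1
    done
    echo "entropy_avail is now $(cat "$src")"
}

# Usage (the threshold 1000 is an arbitrary example):
#   wait_for_entropy 1000 && gpg --gen-key
```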

Climbing grades compared

Below is a comparison of climbing grades that includes the Ogawayama (kyuu / dan) system used in Japan:

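Climbing Routes                                       | Bouldering
French | UK     | Australia | UIAA       | USA        | Hueco  | UK   | Font | Japan
1-2    | HVD    | 8-9       | I-II       | 5.2-5.3    |        |      |      |
2-3    | MS     | 10-12     | III        | 5.4-5.5    |        |      |      |
4      | S      | 13-       | IV         | 5.6        |        |      |      |
4+     | VS     | 13+       | V-         | 5.7        |        |      |      |
5a     |        | 14        | V          | 5.8        |        |      |      |
5b     | HVS    | 15        | V+         | 5.9        | V0     | B1   | 4    | 7 kyu
6a     | E1 5b  | 19        | VI+        | 5.10a      | V0+    | B2   | 4+   | 6 kyu
6a+    | E2 5c  | 19 / 20   | VI+/VII-   | 5.10b      |        |      |      |
6b     |        | 20        | VII        | 5.10c      | V1     | B3   | 5    | 5 kyu
6b+    | E3 5c  | 21        | VII+       | 5.10d      |        |      |      | 4 kyu
6c     |        | 21 / 22   | VII+/VIII- | 5.11a      | V2     | B4   | 6a   |
6c+    | E4 6a  | 22        | VIII-      | 5.11b      | V3     | B5-6 | 6a+  | 3 kyu
7a     |        | 23        | VIII       | 5.11c/d    | V4     |      | 6b/c | 2 kyu
7a+    | E5 6b  | 24        | VIII/VIII+ | 5.12a      |        |      |      |
7b     |        | 25        | VIII+      | 5.12b      | V5     |      | 6c+  | 1 kyu
7b+    | E6 6b  | 26        | IX-        | 5.12c      | V6     | B7   | 7a   |
7c     |        | 27        | IX         | 5.12d      |        |      |      |
7c+    |        | 28        | IX/IX+     | 5.13a      | V7     | B8   | 7a+  | 1 dan
8a     | E7 6c  | 29        | IX+        | 5.13b      | V8     | B9   | 7b+  | 2 dan
8a+    |        | 30        | X-         | 5.13c      | V9     |      | 7c   |
8b     | E8 7a  | 31        | X          | 5.13d      |        |      |      |
8b+    |        | 32        | X/X+       | 5.14a      | V10    | B10  | 7c+  | 3 dan
8c     | E9 7b  | 33        | X+         | 5.14b      | V11    | B11  | 8a   |
8c+    |        | 34        | XI-        | 5.14c      | V12/13 | B12  | 8a+  | 4 dan
9a     | E10 7c | 35        | XI         | 5.14d/5.15 |        |      |      |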

This was created by merging Jonas Wiklund’s and Mountain Equipment’s guides. These things are always subjective; so take care in using it as a reference. Nearly every area’s grades/ratings/rankings disagree, sometimes severely.

Working mimeTypes.rdf for M3U2iTunes and Firefox 2 on OS X

Suppose that you want to listen to an M3U stream on OS X with iTunes and Firefox 2.0. The best way I know of going about this is M3U2iTunes. Unfortunately, Firefox 2.0 does not allow users to edit which application handles M3Us using any simple method. As far as I understand, one must edit the mimeTypes.rdf file located in your Firefox profile folder. A custom mimeTypes.rdf works for me with OS X 10.4, iTunes 7, and Firefox 2.0. Try the following:

[mimeTypes.rdf listing not preserved]

Anti-Filter Iran

Khush amadeed

A lot of traffic to my weblog is related to an article I wrote called The Anti-Filter. Most of these folks are from Iran and are looking for technology to bypass web filters and censorship. I thought I would provide more direct instructions on how to anti-filter.

Step 1: find a proxy.
The quickest is to use a browser-based proxy. Good starting points are:

If you want something more permanent, try one of these programs:

Step 2: Enter URL (for example http://www.orkut.com) into proxy application.

Step 3: Browse through filters, firewalls, etc.

Some countries may block these websites. A smart way to find a proxy you can access is to check one of these links:

Hope this helps and is understandable. If someone would like to translate into Farsi, I would be happy to post a translation. With some luck this will help you anti-filter and will remain unblocked.

Update: I have disabled comments on this article. So many people replied with thanks that it broke the programming language used on this blog. Read the instructions above carefully and you will find how to avoid filtering.