?

Log in

  [entries|friends|calendar]
David Oftedal

[ website | My Website ]
[ userinfo | livejournal userinfo ]
[ calendar | livejournal calendar ]

[28 Jun 2016|06:34pm]
post comment

놀데 없니? [10 Apr 2016|11:46am]
post comment

[24 Mar 2016|11:01am]
One of the algorithms I've had fun using as a programming exercise over the years is a method published by V-Rtifacts for converting anaglyph 3D images, the kind that requires coloured glasses, into regular 3D images.

Start out with an anaglyph image, such as this one, called "Flusslandschaft" by Stereotreff:
Flusslandschaft by Stereotreff

Separate the red and green+blue colour channels to get usable black and white versions of the left and right frames:
Greyscale frame

Blur the original image to remove the red/cyan fringing and get the colours to roughly fit both frames:
Blurred colours

...And re-apply the colours to the black-and-white frames in order to get one 2D image for each eye:
Blurred colours

Quite a clever method!

On some images with a lot of small, brightly-coloured details, the colours tend to get washed out. To remedy this, I've made my versions of the algorithm use a selective blur. Instead of blurring the entire image, the blur is only applied where it would produce a large difference between the left and right frame. One estimate of this is whether it would change the ratio between red and cyan in the pixel.

With that method, the colours look like this:
Selectively blurred colours

And the green leaves in the final image are now much more vivid:
Final image

One of my implementations with more examples and tweakable parameters can be found here for the curious. All in all, this is one of my favourite algorithms, because for its relative simplicity and the complexity of the task, it produces quite an impressive result.
1 comment|post comment

[06 Mar 2016|12:28pm]
I've got this song stuck in my head today. You win this time, YouTube!

post comment

[12 Apr 2015|08:17pm]
I might just have found the saddest thing.

1 comment|post comment

D-- [21 Feb 2015|08:30pm]
My previous experiment with minimizing code proved very satisfactory, but I was left with a nagging feeling at the back of my mind — Isn't there a language we're forgetting in all this? A language that is a delight to the eye, rational and concise, fast and flexible, and at the same time a fountain of expressiveness? Yes — Aren't we forgetting C?

Of course! And so, I set out to port the same script, found here and here and here, to that most auspicious of languages. The code you see at the bottom of each section is the result, and it can also be found here.

#include <stdio.h>. How nostalgic.
#!/usr/bin/env ruby

# Solution to Project Euler problem 61
# Author: David J. Oftedal.
#!/usr/bin/dmd -run

// Solution to Project Euler problem 61
// For a more readable version, see 061.rb
// Author: David J. Oftedal.

import std.stdio, std.math, std.algorithm, std.string, std.conv, std.c.stdlib, std.range;
// Solution to Project Euler problem 61
// Author: David J. Oftedal.

#include <stdio.h>
#include <math.h>

I'm fairly certain the ABC formula has a name stemming from its discoverer or some other mathematician, but I think "ABC" is rather nice. Here I let the C version just return the largest result, saving me the effort of returning an array of results and making a max() function for it.
# The ABC formula for solving quadratic equations (where the solutions are real numbers).
def abc(a, b, c)
  return (-b+Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a),(-b-Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a)
end
// The ABC formula for solving quadratic equations (where the solutions are real numbers).
// Restrict to integer solutions.
int[] abc(int A, int B, int C) {
  real a = to!real(A), b = to!real(B), c = to!real(C);
  return [to!int((-b+sqrt((b^^2.0)-(4.0*a*c)))/(2.0*a)),to!int((-b-sqrt((b^^2.0)-(4.0*a*c)))/(2.0*a))];
}

int max(int[] ints) {
  return std.algorithm.max(ints[0], ints[1]);
}
// The ABC formula for solving quadratic equations (where the solutions are real numbers).
// Restrict to integer solutions and return the maximum solution only.
int abc(int a, int b, int c) {
  return (-b+sqrt((b*b)-(4.0*a*c)))/(2.0*a);
}

In Ruby and D, I defined several named methods for returning polygonal numbers and their roots. Originally, this was meant to improve readability and composability of the code, but with eval() and mixin(), that kind of went out the window. Who needs all those methods, anyway? One method should be enough for anyone. So in C, we have one.
$polygonalTypes = ["triangular", "square", "pentagonal", "hexagonal", "heptagonal", "octagonal"]

class Integer
  # Return the triangular, square, pentagonal, etc. number self.
  # The methods are called triangular, square, etc.
  $polygonalTypes.each_with_index do |name, i|
    eval("def #{name}
      (((#{i}+1)*self**2)-((#{i}-1)*self))/2
    end")
  end
immutable string[6] polygonalTypes = ["triangular", "square", "pentagonal", "hexagonal", "heptagonal", "octagonal"];

// Return the triangular, square, pentagonal, etc. number n.
// The methods are called triangular, square, etc.
string definePolygonals() {
  string Polygonals = "";
  foreach(int i, string name; polygonalTypes)
    Polygonals ~= "int "~name~"(int n) {
      return ((("~to!string(i)~"+1)*n^^2)-(("~to!string(i)~"-1)*n))/2;
    }";
  return Polygonals;
}
mixin(definePolygonals());
enum polygonalTypes {triangular, square, pentagonal, hexagonal, heptagonal, octagonal, none};

// Return the triangular, square, pentagonal, etc. number n.
int polygonalNumber(int n, int polygonalType) {
  return (((polygonalType+1)*n*n)-((polygonalType-1)*n))/2;
}

That was such a success that I did the same thing with the roots.
  # Return the triangular, square, etc. root of self.
  # The methods are called triangular_root, square_root, etc.
  $polygonalTypes.each_with_index do |name, i|
    eval("def #{name}_root
      return 0 if self < 1
      abc(#{i}+1, -(#{i}-1), -(2*self)).max
    end")
  end
// Return the triangular, square, etc. root of n.
// The methods are called triangular_root, square_root, etc.
string definePolygonalRoots() {
  string PolygonalRoots = "";
  foreach(int i, string name; polygonalTypes)
    PolygonalRoots ~= "int "~name~"_root(int n) {
      if (n < 1) return 0;
      return max(abc("~to!string(i)~"+1, -("~to!string(i)~"-1), -(2*n)));
    }";
  return PolygonalRoots;
}
mixin(definePolygonalRoots());
// Return the triangular, square, etc. root of n.
int polygonalRoot(int n, int polygonalType) {
  if (n < 1) return 0;
  return abc(polygonalType+1, -(polygonalType-1), -2*n);
}

The next_ methods have also become one nextPolygonal() method.
  # Return the next triangular, square, etc. number after self.
  # The methods are called next_triangular, next_square, etc.
  $polygonalTypes.each do |name|
    eval("def next_#{name}
      (#{name}_root.to_i + 1).#{name}
    end")
  end
end
// Return the next triangular, square, etc. number after n.
// The methods are called next_triangular, next_square, etc.
string defineNextMethods() {
  string nextMethods = "";
  foreach(string name; polygonalTypes)
    nextMethods ~= "int next_" ~ name ~ "(int n) { return " ~ name ~ "(" ~ name ~ "_root(n) + 1); }\n";
  return nextMethods;
}
mixin(defineNextMethods());
// Return the next triangular, square, etc. number after n.
int nextPolygonal(int n, int polygonalType) {
  return polygonalNumber( polygonalRoot(n, polygonalType) + 1, polygonalType);
}

My home rolled C enumerable API is quite handy: One struct with two fields.
# Generators that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
$polygonalTypes.each do |name|
  eval("class #{name.capitalize}Generator
    include Enumerable

    # Start at the number equal to or larger than n.
    def initialize n
      @current = (n-1).next_#{name}
    end

    def each
      loop do
        yield @current
        @current = @current.next_#{name}
      end
    end
  end")
end
// Return the next triangular, square, etc. number after n.
// The methods are called next_triangular, next_square, etc.
string defineNextMethods() {
  string nextMethods = "";
  foreach(string name; polygonalTypes)
    nextMethods ~= "int next_" ~ name ~ "(int n) { return " ~ name ~ "(" ~ name ~ "_root(n) + 1); }\n";
  return nextMethods;
}
mixin(defineNextMethods());

// Generators that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
string defineGenerators() {
  string generators = "interface NumberGenerator {
    @property bool empty();
    @property int front();
    NumberGenerator set(int);
    void popFront();
    NumberGenerator takeMax(int);
  }
  ";

  foreach(string name; polygonalTypes)
    generators ~= "class "~capitalize(name)~"Generator : NumberGenerator {

      int current = 1;
      int maxValue = int.min;
      bool isEmpty = false;

      @property bool empty() {
        return isEmpty;
      }

      @property int front() {
        return current;
      }

      "~capitalize(name)~"Generator takeMax(int n) {
        maxValue = n;
        return this;
      }

      // Start at the number equal to or larger than n.
      "~capitalize(name)~"Generator set(int n) {
        current = next_"~name~"(n-1);
        isEmpty = false;
        return this;
      }

      void popFront() {
        current = next_"~name~"(current);
        if(maxValue != int.min && current > maxValue) isEmpty = true;
      }
    }";
  return generators;
}
mixin(defineGenerators());
// Generator that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
typedef struct {
  int current;
  int polygonalType;
} numberGenerator;

Round about here I remembered why I used enumerables or ranges in the first place. Writing tons of loops is old hat, though, so why not a recursive function? You know how they're so easy to debug. This one proved no exception.
# Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
if $0 == __FILE__

  # Chain together the six streams of polygonal numbers in all possible orders.
  polygonals = eval("[" << $polygonalTypes.map{|name| name.capitalize + "Generator"}.join(", ") << "]")

  # Upon finding a number that may yield a valid chain, create a new generator and search for the second number in that chain, and if found, a third generator, etc.
  # Continue until a chain with six valid numbers is found.
  code = "num0 = 1010\n"
  (1..6).each do |i|
    code += "polygonal#{i.to_s}.new((num#{(i-1).to_s}.to_s[2..3] + \"10\").to_i).take_while{|n| n < 10000 && ("+i.to_s+" == 1 ||
    num#{(i-1).to_s}.to_s[2..3] == n.to_s[0..1])}.select{|n| "+i.to_s+" < 6 || num1.to_s[0..1] == n.to_s[2..3]}.each do |num#{i.to_s}|\n"
  end

  # There's only one chain of six numbers matching the criteria. As soon as the generators generate one, display the chain and exit.
  code += "puts "
  (1..6).each {|i| code += "num"+i.to_s+".to_s+\" "+(i<6 ? "+ \"+" : "= \"+
  [num1,num2,num3,num4,num5,num6].inject(:+).to_s\n
  exit\n")}
  code += ("end\n" * 6)

  polygonals.permutation.each do |polygonal1, polygonal2, polygonal3, polygonal4, polygonal5, polygonal6|
    eval(code)
  end
end
// Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
string defineFindChain() {
  string code = "void findChain(";
  for(int i=1;i<7;i++) code ~= "NumberGenerator polygonal"~to!string(i)~(i<6 ? "," : ") {\nint num0 = 1010;\n");

  // Upon finding a number that may yield a valid chain, create a new generator and search for the second number in that chain, and if found, a third generator, etc.
  // Continue until a chain with six valid numbers is found.
  for(int i=1;i<7;i++)
    code ~= "foreach(int num"~to!string(i)~"; filter!(n => "~to!string(i)~" == 1 || (("~to!string(i)~" < 6 || to!string(num"~to!string(std.algorithm.max(i-5, 0))~")[0..2] == to!string(n)[2..4]) && to!string(num"~to!string(i-1)~")[2..4] == to!string(n)[0..2]))(polygonal"~to!string(i)~".set(to!int(to!string(num"~to!string(i-1)~")[2..4] ~ \"10\")).takeMax(9999))) {\n";

  // There's only one chain of six numbers matching the criteria. As soon as the generators generate one, display the chain and exit.
  code ~= "writefln(";
  for(int i=1;i<7;i++) code ~= "to!string(num"~to!string(i)~")~\" "~(i<6 ? "+ \"~" : "= \"\n~to!string(reduce!((a,b) => a+b)(0, [num1,num2,num3,num4,num5,num6])));
  exit(0);");
  for(int i=0;i<7;i++) code ~= "}";

  return code;
}
mixin(defineFindChain());

string defineMain() {
  string main = "int main(string[] argv) {
  // Chain together the six streams of polygonal numbers in all possible orders.
  NumberGenerator[6] polygonals = new NumberGenerator[6];\n";
  foreach(int i, string name; polygonalTypes) main ~= "\tpolygonals["~to!string(i)~"] = new "~capitalize(name)~"Generator();\n";
  main ~= "\tint[] indices = [0, 1, 2, 3, 4, 5];
  do {
    findChain(";
  for(int i=0;i<6;i++) main ~= "polygonals[indices["~to!string(i)~(i<5 ? "]]," : "]]);
  } while(nextPermutation(indices));
  return 0;
  }");
  return main;
}
mixin(defineMain());
// Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
void findChain(int polygonalTypes[], int numbersFoundCount, int numbersFound[]) {

  for(int polygonalType = 0; polygonalType < 6; polygonalType++) {
    if(polygonalTypes[polygonalType] == none) continue;
    
    numberGenerator generator;
    generator.polygonalType = polygonalTypes[polygonalType];

    int previousNumber = (numbersFoundCount == 0 ? 1010 : numbersFound[numbersFoundCount - 1]);
    generator.current = ((previousNumber % 100) * 100) + 10;
    generator.current = nextPolygonal(generator.current - 1, generator.polygonalType);
    int newNumber = -1;
    while(numbersFound[5] == -1) {

      newNumber = generator.current;
      generator.current = nextPolygonal(generator.current, generator.polygonalType);

      if(newNumber > 9999) break;
      else if((newNumber / 100) < 10) continue;
      else if(numbersFoundCount > 0 && (newNumber / 100) != (previousNumber % 100)) continue;
      else if(numbersFoundCount == 5 && (numbersFound[0] / 100) != (newNumber % 100)) continue;
      else {
        numbersFound[numbersFoundCount] = newNumber;

        if(numbersFoundCount < 5) {
          
          // Upon finding a number that may yield a valid chain, search for the second number in that chain, and if found, the third, etc.
          // Continue until a chain with six valid numbers is found.
          int remainingTypes[6];
          for(int j = 0; j < 6; j++) {
            remainingTypes[j] = (j == polygonalType ? none : polygonalTypes[j]);
          }
          findChain(remainingTypes, numbersFoundCount+1, numbersFound);
        
        } else {
          // Return when a chain is found.
          return;
        }
      }
    }
  }
}

int main() {
  int polygonalTypes[] = {triangular, square, pentagonal, hexagonal, heptagonal, octagonal};
  int numbersFound[] = {-1, -1, -1, -1, -1, -1};
  findChain(polygonalTypes, 0, numbersFound);

  // There's only one such chain. Display the result and exit.
  printf("%d + %d + %d + %d + %d + %d = %d\n", numbersFound[0], numbersFound[1], numbersFound[2], numbersFound[3], numbersFound[4], numbersFound[5],
  numbersFound[0] + numbersFound[1] + numbersFound[2] + numbersFound[3] + numbersFound[4] + numbersFound[5]);

  return 0;
}

The code speaks for itself. There's something almost majestic about the big lump of code that iterates through all those numbers. A size or speed comparison would obviously be somewhat meaningless, as the C version had both lower-level code and fewer abstractions from the outset.

I also suspect that D could be made to inline the entire program and run it in zero time each time, by way of some well-placed mixin() calls around the functions that initiate the program and display output. Be that as it may, the C code turned out to be a lot more concise than I'd expected it to be. Perhaps all those web developers just haven't given it a fair chance.
2 comments|post comment

[12 Jun 2014|10:20pm]
After Andrei Alexandrescu kindly shared the previous post on D online, Redditor PhilippeSigaud pointed out a few things worth mentioning:

Uniform Function Call Syntax
Instead of allowing monkey patching of classes, D has something called Uniform Function Call Syntax, whereby methods that take a certain type as their first parameter can also be called as member functions of that type. Not entirerly unlike Extension Methods in C♯. So writing 3.pentagonal in _061_compact_d.d would have been perfectly alright. Here is an example using UFCS:

import std.stdio;

int square(int n) {
	return n*n;
}

void main() {
	writefln("%d", square(3));
}
import std.stdio;

int square(int n) {
	return n*n;
}

void main() {
	writefln("%d", 3.square());
}

A saner way of calling mixin
The same Redditor points out that the double mixins are not necessary. While mixin("defineMain();") doesn't work and mixin("mixin(defineMain());"); does, the much simpler mixin(defineMain());, which calls defineMain() directly, also works like a charm. This construct doesn't compile the call to defineMain() from a string, but instead calls defineMain() directly and compiles the returned string. Although I still think the double mixin is hilarious, I made a new _061_d_v2.d to use this more proper construct.

string definePolygonals() {
	string Polygonals = "";
	foreach(int i, string name; polygonalTypes)
		Polygonals ~= "int "~name~"(int n) {
			return ((("~to!string(i)~"+1)*n^^2)-(("~to!string(i)~"-1)*n))/2;
		}";
	return Polygonals;
}
mixin("mixin(definePolygonals());");
string definePolygonals() {
	string Polygonals = "";
	foreach(int i, string name; polygonalTypes)
		Polygonals ~= "int "~name~"(int n) {
			return ((("~to!string(i)~"+1)*n^^2)-(("~to!string(i)~"-1)*n))/2;
		}";
	return Polygonals;
}
mixin(definePolygonals());

More elegant ranges
Redditor MrJNewt makes two interesting suggestions about a more elegant way of implementing the ranges using templates and std.range.recurrence or std.range.sequence.

More interesting comments might pop up on the D Programming Language forums, Reddit, Facebook and/or Twitter.
2 comments|post comment

代碼高爾夫 [08 Jun 2014|10:16pm]
One thing Andrei Alexandrescu mentioned in his recent talk about D that caught my attention in particular, was its inclusion of mixin; a function that allows code to be invoked at compile-time that defines values, and actually other code as well. In other words, mixin is somewhat like a compile-time eval, that allows a fairly large subset of code to be defined in the code itself, and still be compiled in.

This got me thinking of a script I wrote in Ruby to solve Project Euler Problem 61, a problem which involves chaining together several types of polygonal numbers which share certain digits, so that they form a circular chain. Polygonal numbers are analogous to square numbers, only with polygons other than squares. For instance, the square of 2 is 4, and triangular number 2 is 3.

To solve this problem, I used Ruby's Enumerable Module to iterate through and filter the numbers instead of regular loops, as it's somewhat easy to chain together different filters using that syntax. It's also similar to the IEnumerable interface in C♯. Naturally, I got to wondering if D has a similar device, and it does, after a fashion.

Moreover, as I was solving the problem, the code got more and more repetitive, so I turned it into an exercise in abusing eval to create compact, self-generating code. It proved an irresistible challenge to see if this could also be done in D. The answer is below.

There are three versions of the program. At the top of each section is the most readable version, from 061.rb. In the middle is the more compact Ruby version, found in 061-compact.rb. Finally, the resulting D version, also found in _061_compact_d.d, is at the bottom.

Starting off gently with the headers.
#!/usr/bin/env ruby

# Solution to Project Euler problem 61
# Author: David J. Oftedal.
#!/usr/bin/env ruby

# Solution to Project Euler problem 61
# For a more readable version, see 061.rb
# Author: David J. Oftedal.
#!/usr/bin/dmd -run

// Solution to Project Euler problem 61
// For a more readable version, see 061.rb
// Author: David J. Oftedal.

import std.stdio, std.math, std.algorithm, std.string, std.conv, std.c.stdlib, std.range;

Next up is a helper function for solving quadratic equations. This is useful for computing the inverse of many of the functions below.
In the D version, I've restricted the function to integer parameters and return values, as that is what I'm using it for in this program.
I've also added a little max function, that makes it convenient to get the maximum element from an array of two elements.
# The ABC formula for solving quadratic equations (where the solutions are real numbers).
def abc(a, b, c)
	return (-b+Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a),(-b-Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a)
end
# The ABC formula for solving quadratic equations (where the solutions are real numbers).
def abc(a, b, c)
	return (-b+Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a),(-b-Math.sqrt(b**2.0-(4.0*a*c)))/(2.0*a)
end
// The ABC formula for solving quadratic equations (where the solutions are real numbers).
// Restrict to integer solutions.
int[] abc(int A, int B, int C) {
	real a = to!real(A), b = to!real(B), c = to!real(C);
	return [to!int((-b+sqrt((b^^2.0)-(4.0*a*c)))/(2.0*a)),to!int((-b-sqrt((b^^2.0)-(4.0*a*c)))/(2.0*a))];
}

int max(int[] ints) {
	return std.algorithm.max(ints[0], ints[1]);
}

Next come the functions for getting the polygonal numbers. In the Ruby version, I was able to add the methods directly to the Integer class, using what's known as monkey patching, allowing me to make calls like 3.pentagonal. In D, the methods are global instead, so the call would be pentagonal(3).

Here is where I started noting repetitions in the code. The pentagonal, hexagonal, heptagonal and octagonal functions all follow very similar formulas, and it turns out that all of the functions can be made to use the same one, with one variable inserted in a couple of places. To get the convenience of easy to remember function names without having to type in each one, I used eval to define one function for each number type.

In D, a similar result can be achieved with the mixin compile-time function. Pieces of code that can be executed compile-time can be run to insert values and definitions into the code using this function. There are some limitations on what kinds of code can be run at compile-time – for instance, foreach did not work – But calling a method that calls foreach and returns a value does. I therefore defined methods that constructed code contained in a string, called the methods using mixin, and to get the resulting code defined globally, outside any method, I passed the strings to – you guessed it – mixin. The double mixin was born.
class Integer
	# Return triangular number self
	def triangular
		(self*(1+self))/2
	end

	# Return pentagonal number self
	def pentagonal
		((3*self**2)-self)/2
	end

	# Return hexagonal number self
	def hexagonal
		(2*self**2)-self
	end

	# Return heptagonal number self
	def heptagonal
		((5*self**2)-(3*self))/2
	end

	# Return octagonal number self
	def octagonal
		(3*self**2)-(2*self)
	end
$polygonalTypes = ["triangular", "square", "pentagonal", "hexagonal", "heptagonal", "octagonal"]

class Integer
	# Return the triangular, square, pentagonal, etc. number self.
	# The methods are called triangular, square, etc.
	$polygonalTypes.each_with_index do |name, i|
		eval("def #{name}
			(((#{i}+1)*self**2)-((#{i}-1)*self))/2
		end")
	end
immutable string[6] polygonalTypes = ["triangular", "square", "pentagonal", "hexagonal", "heptagonal", "octagonal"];

// Return the triangular, square, pentagonal, etc. number n.
// The methods are called triangular, square, etc.
string definePolygonals() {
	string Polygonals = "";
	foreach(int i, string name; polygonalTypes)
		Polygonals ~= "int "~name~"(int n) {
			return ((("~to!string(i)~"+1)*n^^2)-(("~to!string(i)~"-1)*n))/2;
		}";
	return Polygonals;
}
mixin("mixin(definePolygonals());");

Just as we have squares, we also have square roots, and there are also triangular roots, pentagonal roots, and so on. These functions are the inverse of the ones above, and this gave me another opportunity to make some heavily abbreviated code.
	# Return the triangular root of self.
	def triangular_root
		return 0 if self < 1
		abc(1, 1, -(2*self)).max
	end

	# Return the pentagonal root of self.
	def pentagonal_root
		return 0 if self < 1
		((Math.sqrt((24*self)+1)+1)/6)
	end

	# Return the hexagonal root of self.
	def hexagonal_root
		return 0 if self < 1
		((Math.sqrt((8*self)+1)+1)/4)
	end

	# Return the heptagonal root of self.
	def heptagonal_root
		return 0 if self < 1
		abc(5, -3, -(2*self)).max
	end

	# Return the octagonal root of self.
	def octagonal_root
		return 0 if self < 1
		abc(3, -2, -self).max
	end
	# Return the triangular, square, etc. root of self.
	# The methods are called triangular_root, square_root, etc.
	$polygonalTypes.each_with_index do |name, i|
		eval("def #{name}_root
			return 0 if self < 1
			abc(#{i}+1, -(#{i}-1), -(2*self)).max
		end")
	end
// Return the triangular, square, etc. root of n.
// The methods are called triangular_root, square_root, etc.
string definePolygonalRoots() {
	string PolygonalRoots = "";
	foreach(int i, string name; polygonalTypes)
		PolygonalRoots ~= "int "~name~"_root(int n) {
			if (n < 1) return 0;
			return max(abc("~to!string(i)~"+1, -("~to!string(i)~"-1), -(2*n)));
		}";
	return PolygonalRoots;
}
mixin("mixin(definePolygonalRoots());");

These next_ methods return the next square, triangular, pentagonal, etc. number after the one they're called on. For instance, 4.next_square will return 9.
	# Return the next square number after self.
	def next_square
		((Math.sqrt(self).to_i + 1)**2).to_i
	end

	# Return the next triangular, pentagonal, etc. number after self.
	def next_triangular
		(triangular_root.to_i + 1).triangular
	end

	def next_pentagonal
		(pentagonal_root.to_i + 1).pentagonal
	end

	def next_hexagonal
		(hexagonal_root.to_i + 1).hexagonal
	end

	def next_heptagonal
		(heptagonal_root.to_i + 1).heptagonal
	end

	def next_octagonal
		(octagonal_root.to_i + 1).octagonal
	end
end
	# Return the next triangular, square, etc. number after self.
	# The methods are called next_triangular, next_square, etc.
	$polygonalTypes.each do |name|
		eval("def next_#{name}
			(#{name}_root.to_i + 1).#{name}
		end")
	end
end
// Return the next triangular, square, etc. number after n.
// The methods are called next_triangular, next_square, etc.
string defineNextMethods() {
	string nextMethods = "";
	foreach(string name; polygonalTypes)
		nextMethods ~= "int next_" ~ name ~ "(int n) { return " ~ name ~ "(" ~ name ~ "_root(n) + 1); }\n";
	return nextMethods;
}
mixin("mixin(defineNextMethods());");

The generators, called Enumerables in Ruby and .Net parlance, or Ranges in D, yield sequences of the polygonal numbers that can be iterated through and filtered using expressions that are fairly easy to chain together. One generator yielding one type of number can also be swapped for another, allowing different number types to be used in different positions in the chain by swapping the generators around.

The D version was a good deal more awkward to write, as the D library functions for ranges don't have all the bells and whistles as Ruby's Enumerables, and D being a typed language, the different ranges all had to implement a common interface, or alternatively derive from a common base type. But I got there in the end:
# Generators that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
class TriangularGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_triangular
	end

	def each
		loop do
			yield @current
			@current = @current.next_triangular
		end
	end
end

class SquareGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_square
	end

	def each
		loop do
			yield @current
			@current = @current.next_square
		end
	end
end

class PentagonalGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_pentagonal
	end

	def each
		loop do
			yield @current
			@current = @current.next_pentagonal
		end
	end
end

class HexagonalGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_hexagonal
	end

	def each
		loop do
			yield @current
			@current = @current.next_hexagonal
		end
	end
end

class HeptagonalGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_heptagonal
	end

	def each
		loop do
			yield @current
			@current = @current.next_heptagonal
		end
	end
end

class OctagonalGenerator
	include Enumerable

	# Start at the number equal to or larger than n.
	def initialize n
		@current = (n-1).next_octagonal
	end

	def each
		loop do
			yield @current
			@current = @current.next_octagonal
		end
	end
end
# Generators that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
$polygonalTypes.each do |name|
	eval("class #{name.capitalize}Generator
		include Enumerable

		# Start at the number equal to or larger than n.
		def initialize n
			@current = (n-1).next_#{name}
		end

		def each
			loop do
				yield @current
				@current = @current.next_#{name}
			end
		end
	end")
end
// Generators that will yield infinite streams of triangular, square, pentagonal, etc. numbers.
string defineGenerators() {
	string generators = "interface NumberGenerator {
		@property bool empty();
		@property int front();
		NumberGenerator set(int);
		void popFront();
		NumberGenerator takeMax(int);
	}
	";

	foreach(string name; polygonalTypes)
		generators ~= "class "~capitalize(name)~"Generator : NumberGenerator {

			int current = 1;
			int maxValue = int.min;
			bool isEmpty = false;

			@property bool empty() {
				return isEmpty;
			}

			@property int front() {
				return current;
			}

			"~capitalize(name)~"Generator takeMax(int n) {
				maxValue = n;
				return this;
			}

			// Start at the number equal to or larger than n.
			"~capitalize(name)~"Generator set(int n) {
				current = next_"~name~"(n-1);
				isEmpty = false;
				return this;
			}

			void popFront() {
				current = next_"~name~"(current);
				if(maxValue != int.min && current > maxValue) isEmpty = true;
			}
		}";
	return generators;
}
mixin("mixin(defineGenerators());");

Now for the crown jewel. This construction of six generators inside one another is somewhat similar to a loop-inside-a-loop-inside-a-loop-inside-a-loop-inside-a-loop-inside-a-loop, generating and filtering square and triangular and pentagonal and other polygonal numbers and passing on those that fit together. The numbers are all supposed to be four digits long and share their first two digits with the previous number and their last two digits with the next one, so that's what all of these filters are looking for.

D doesn't know how to permute arrays of objects it can't sort, so in D I defined one array of generators and one array of indices which D does know how to permute, and then I defined a function which looks for the circular chain of numbers, and which takes the generators, found by using the indices, as parameters. The effect is the same in either case.
# Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
if $0 == __FILE__

	# Combine the six streams of numbers in all possible orders.
	generators = [TriangularGenerator, SquareGenerator, PentagonalGenerator, HexagonalGenerator, HeptagonalGenerator, OctagonalGenerator]
	generators.permutation.each do |generator1, generator2, generator3, generator4, generator5, generator6|

		# Select six four-digit numbers, each from a separate generator, so that they form a circular chain, where the first two digits of each number equal the last two digits of the previous one.
		generator1.new(1010).take_while{|n| n < 10000}.select{|n| n.to_s[2] != "0"}.each do |num1|

			generator2.new((num1.to_s[2..3] + "10").to_i).take_while{|n| n < 10000 && num1.to_s[2..3] == n.to_s[0..1]}.each do |num2|

				generator3.new((num2.to_s[2..3] + "10").to_i).take_while{|n| n < 10000 && num2.to_s[2..3] == n.to_s[0..1]}.each do |num3|

					generator4.new((num3.to_s[2..3] + "10").to_i).take_while{|n| n < 10000 && num3.to_s[2..3] == n.to_s[0..1]}.each do |num4|

						generator5.new((num4.to_s[2..3] + "10").to_i).take_while{|n| n < 10000 && num4.to_s[2..3] == n.to_s[0..1]}.each do |num5|

							generator6.new((num5.to_s[2..3] + "10").to_i).take_while{|n| n < 10000 && num5.to_s[2..3] == n.to_s[0..1]}.select{|n| num1.to_s[0..1] == n.to_s[2..3]}.each do |num6|

								# There's only one such chain. Display the result and exit.
								puts num1.to_s<<" + "<<num2.to_s<<" + "<<num3.to_s<<" + "<<num4.to_s<<" + "<<num5.to_s<<" + "<<num6.to_s<<" = "<<[num1,num2,num3,num4,num5,num6].inject(:+).to_s
								exit

							end
						end
					end
				end
			end
		end
	end
end
# Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
if $0 == __FILE__

	# Chain together the six streams of polygonal numbers in all possible orders.
	polygonals = eval("[" << $polygonalTypes.map{|name| name.capitalize + "Generator"}.join(", ") << "]")

	# Upon finding a number that may yield a valid chain, create a new generator and search for the second number in that chain, and if found, a third generator, etc.
	# Continue until a chain with six valid numbers is found.
	code = "num0 = 1010\n"
	(1..6).each do |i|
		code += "polygonal#{i.to_s}.new((num#{(i-1).to_s}.to_s[2..3] + \"10\").to_i).take_while{|n| n < 10000 && ("+i.to_s+" == 1 ||
		num#{(i-1).to_s}.to_s[2..3] == n.to_s[0..1])}.select{|n| "+i.to_s+" < 6 || num1.to_s[0..1] == n.to_s[2..3]}.each do |num#{i.to_s}|\n"
	end

	# There's only one chain of six numbers matching the criteria. As soon as the generators generate one, display the chain and exit.
	code += "puts "
	(1..6).each {|i| code += "num"+i.to_s+".to_s+\" "+(i<6 ? "+ \"+" : "= \"+
	[num1,num2,num3,num4,num5,num6].inject(:+).to_s\n
	exit\n")}
	code += ("end\n" * 6)

	polygonals.permutation.each do |polygonal1, polygonal2, polygonal3, polygonal4, polygonal5, polygonal6|
		eval(code)
	end
end
// Find a circular chain of six four-digit numbers, each of a different polygonal type, where the first two digits of each number equal the last two digits of the previous one.
string defineFindChain() {
	string code = "void findChain(";
	for(int i=1;i<7;i++) code ~= "NumberGenerator polygonal"~to!string(i)~(i<6 ? "," : ") {\nint num0 = 1010;\n");

	// Upon finding a number that may yield a valid chain, create a new generator and search for the second number in that chain, and if found, a third generator, etc.
	// Continue until a chain with six valid numbers is found.
	for(int i=1;i<7;i++)
		code ~= "foreach(int num"~to!string(i)~"; filter!(n => "~to!string(i)~" == 1 || (("~to!string(i)~" < 6 || to!string(num"~to!string(std.algorithm.max(i-5, 0))~")[0..2] == to!string(n)[2..4]) && to!string(num"~to!string(i-1)~")[2..4] == to!string(n)[0..2]))(polygonal"~to!string(i)~".set(to!int(to!string(num"~to!string(i-1)~")[2..4] ~ \"10\")).takeMax(9999))) {\n";

	// There's only one chain of six numbers matching the criteria. As soon as the generators generate one, display the chain and exit.
	code ~= "writefln(";
	for(int i=1;i<7;i++) code ~= "to!string(num"~to!string(i)~")~\" "~(i<6 ? "+ \"~" : "= \"\n~to!string(reduce!((a,b) => a+b)(0, [num1,num2,num3,num4,num5,num6])));
	exit(0);");
	for(int i=0;i<7;i++) code ~= "}";

	return code;
}
mixin("mixin(defineFindChain());");

string defineMain() {
	string main = "int main(string[] argv) {
	// Chain together the six streams of polygonal numbers in all possible orders.
	NumberGenerator[6] polygonals = new NumberGenerator[6];\n";
	foreach(int i, string name; polygonalTypes) main ~= "\tpolygonals["~to!string(i)~"] = new "~capitalize(name)~"Generator();\n";
	main ~= "\tint[] indices = [0, 1, 2, 3, 4, 5];
	do {
		findChain(";
	for(int i=0;i<6;i++) main ~= "polygonals[indices["~to!string(i)~(i<5 ? "]]," : "]]);
	} while(nextPermutation(indices));
	return 0;
	}");
	return main;
}
mixin("mixin(defineMain());");

Porting a Ruby script that's so Ruby, it's almost Perl to a statically typed, compiled language is a bit of a challenge. I had to take care when implementing the ranges, as they are somewhat more demanding to implement and use than Ruby's Enumerables. The double mixins with the method calls also took a while to figure out, after the convenience of eval. On the other hand, D is so expressive and so flexible that it can be done, and as with the previous example with the Python script, the result is not only nearly as concise as the original, and retains the same general structure, but it's also faster.
3 comments|post comment

[07 Jun 2014|02:58pm]
I decided to revisit the programming language D, after having the pleasure of attending Andrei Alexandrescu's talk on D at NDC. Version 2 of the D language has been out for a long time since I ported one of my Python scripts to D version 1 previously, so I started out by testing the same script in D2.

The garbage collector in D2 is obviously greatly improved, and the program ran fine this time with the garbage collector left on.
#!/usr/bin/env python

import sys, re

if len(sys.argv) != 3:
	print "Usage: python %s oldchecksums newchecksums" % sys.argv[0]
	sys.exit(1)
#!/usr/bin/dmd -run

import std.stdio, std.file, std.string, std.regex;

int main(string[] argv) {

	if (argv.length != 3) {
		writefln("Usage: %s oldchecksums newchecksums", argv[0]);
		return 1;
	}

splitLines() seemed to work exactly as it should this time.
# Read the checksum files.

oldfile = open(sys.argv[1], "rb")
newfile = open(sys.argv[2], "rb")

oldlines = sorted(oldfile.readlines())
newlines = sorted(newfile.readlines())

oldfile.close()
newfile.close()
	// Read the checksum files.

	string oldfile = strip(cast(string)read(argv[1]));
	string newfile = strip(cast(string)read(argv[2]));

	string[] oldlines = splitLines(oldfile).sort;
	string[] newlines = splitLines(newfile).sort;

To solve the problem of splitting lines into exactly two fields, I turned to matchFirst() instead of array slicing.
# Parse the data.

# Map paths to checksums and vice versa.
oldpaths = dict()
newpaths = dict()
oldsums = dict()
newsums = dict()

for line in oldlines:
	line = line.strip()
	line = re.split("\s+",line,1)
	checksum = line[0]
	path = line[1]
	oldpaths[path] = checksum
	oldsums[checksum] = path

for line in newlines:
	line = line.strip()
	line = re.split("\s+",line,1)
	checksum = line[0]
	path = line[1]
	newpaths[path] = checksum
	newsums[checksum] = path
// Parse the data.

	// Map paths to checksums and vice versa.
	string[string] oldpaths;
	string[string] newpaths;
	string[string] oldsums;
	string[string] newsums;

	foreach(string line; oldlines) {
		line = strip(line);
		auto fields = std.regex.matchFirst(line, r"\s+");
		string checksum = fields.pre();
		string path = fields.post();
		oldpaths[path] = checksum;
		oldsums[checksum] = path;
	}

	foreach(string line; newlines) {
		line = strip(line);
		auto fields = std.regex.matchFirst(line, r"\s+");
		string checksum = fields.pre();
		string path = fields.post();
		newpaths[path] = checksum;
		newsums[checksum] = path;
	}

This bit remains the same as in the previous version.
# Correlate the data.

newfiles = list()
deletedfiles = list()
movedfiles = list()
modifiedfiles = list()

# Look for new files.
for path, checksum in newpaths.items():
	if (path not in oldpaths) and (checksum not in oldsums):
		newfiles.append(path)

# Look for deleted, modified or moved files.
for path, checksum in oldpaths.items():
	if (path not in newpaths) and (checksum not in newsums):
		deletedfiles.append(path)
	elif (path not in newpaths) and (checksum in newsums):
		newpath = newsums[checksum]
		movedfiles.append( (path, newpath) )
	elif (path in newpaths) and (checksum not in newsums):
		modifiedfiles.append(path)
	// Correlate the data.

	string[] newfiles;
	string[] deletedfiles;
	string[] movedfiles;
	string[] modifiedfiles;

	// Look for new files.
	foreach(string path, string checksum; newpaths) {
		if(!(path in oldpaths) && !(checksum in oldsums))
			newfiles ~= path;
	}

	// Look for deleted, modified or moved files.
	foreach(string path, string checksum; oldpaths) {
		if(!(path in newpaths) && !(checksum in newsums))
			deletedfiles ~= path;
		else if(!(path in newpaths) && (checksum in newsums)) {
			string newpath = newsums[checksum];
			movedfiles ~= path ~ " → " ~ newpath;
		}
		else if((path in newpaths) && !(checksum in newsums))
			modifiedfiles ~= path;
	}

D2 is completely allergic to invalid Unicode, and will not allow treating invalid Unicode sequences as Unicode strings at all. Since my current test data don't contain them anyway, I changed the printf calls back to calls to writefln.
# Present the data.

newfiles = sorted(newfiles)
deletedfiles = sorted(deletedfiles)
movedfiles = sorted(movedfiles)
modifiedfiles = sorted(modifiedfiles)

print

print "Modified files:\n"
for path in modifiedfiles:
	print path

print
print

print "Moved files:\n"
for path, newpath in movedfiles:
	print u"%s \u2192 %s".encode("UTF-8") % (path, newpath)

print
print

print "Deleted files:\n"
for path in deletedfiles:
	print path

print
print

print "New files:\n"
for path in newfiles:
	print path

print
	// Present the data.

	newfiles = newfiles.sort;
	deletedfiles = deletedfiles.sort;
	movedfiles = movedfiles.sort;
	modifiedfiles = modifiedfiles.sort;

	writefln("");

	writefln("Modified files:\n");
	foreach(string path; modifiedfiles)
		writefln(path);

	writefln("");
	writefln("");

	writefln("Moved files:\n");
	foreach(string path; movedfiles)
		writefln(path);

	writefln("");
	writefln("");

	writefln("Deleted files:\n");
	foreach(string path; deletedfiles)
		writefln(path);

	writefln("");
	writefln("");

	writefln("New files:\n");
	foreach(string path; newfiles)
		writefln(path);

	writefln("");

	return 0;
}

Version 2 of the D script totalled 2366 characters across 108 lines, compared to the 1952 characters and 100 lines of the Python version. Again, D was faster than Python, with D taking 0.958 seconds to process the test data I used this time, and Python taking 2.425 seconds.

All in all, D2 offered an improved experience compared to D1, and I look forward to trying out more of the language.
post comment

金藝琳 [20 Apr 2014|07:45pm]


post comment

聖誕快樂! [20 Dec 2013|08:50pm]
#!/usr/bin/env ruby

days = ["first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth"]

things = ["a partridge in a pear tree", "two turtledoves", "three French hens", "four calling birds", "five golden rings", "six geese a-laying", "seven swans a-swimming", "eight maids a-milking", "nine ladies dancing", "ten lords a-leaping", "eleven pipers piping", "twelve drummers drumming"]

days.each_with_index do |day, i|
	puts "On the " + day + " day of Christmas, my true love sent to me:"
	things[0..i].reverse.each_with_index do |thing, j|
		thing = thing.capitalize if j == 0
		print "and " if (i == j and i > 0)
		print thing
		puts "," if j < i
	end
	puts ".\n\n"
end
Merry Christmas, everybody!
1 comment|post comment

I Sverige – För tiden [24 Jun 2012|06:42am]
Since I just recently became employed, I decided to work through the entire summer this year, and gain some more real-life experience as a programmer.

Little did I know I'd be thrown right into a hectic project, which on top of it all takes place in the outskirts of Stockholm. So I now commute there weekly, and spend Saturday and Sunday in Norway.

The city itself is a fine place, and very beautiful, as it's spread across 14 islands in a bay of a very large lake. If I work hard enough, I might even find the time to see it! Wish me luck!

David by Eriksdalsbadet
David by Eriksdalsbadet.
post comment

天長地久 [21 Apr 2012|11:09am]
You look away for one second, and it's next year already. After I graduated for the second time last year to much rejoicing, one would have thought I'd had enough of all that studying, and would long since have joined the ranks of productive, well-adjusted, not to mention employed citizens that we've all heard so much about. Not so. A slight detour took me back to the University, where I was buried in political science books from August until Christmas, and still remain for the time being.

Around Christmas, renewed effort on my part, and a bit of help from Finn, a Norwegian equivalent of Craigslist, more or less catapulted me back into the aforementioned well-adjusted, productive class. If catapulting can take four months, anyway. So here I am enjoying my first week as a Java developer at Lemontree, and being very happy indeed with where I've ended up.

David at Lemontree

See you all again before you can say "檸檬樹".
2 comments|post comment

[06 Apr 2011|08:12pm]
It's been an eventful couple of days here in Oslo. On Monday, the Data Retention Directive which I mentioned earlier was finally passed by the parliament as expected, meaning that the government will shortly begin forcing communications providers to spy on their customers. All thanks to the sanction of a party which claims that it "advocates restrictive practices with respect to electronic registration" and that "the scope of surveillance should be limited to what can be proved to be absolutely necessary". Strange times, especially for someone of my generation, for whom the fall of the Berlin Wall was one of the most memorable events of our childhood.

On Tuesday, I logged into our aptly named university web site studweb and found this:
Degrees attained

Yes, ladies and gentlemen, you're now reading the blog of a bona fide computer scientist. A doer of science!

I'm using my remaining time as a student to the fullest, and have already learned more about the core of Jupiter and about marginal profit this semester than I ever thought possible. Now that I get to add job hunting to that list, it'll be nice to have a diploma with me to show for my efforts.
5 comments|post comment

老大哥在看著你 [09 Mar 2011|10:12pm]
The Norwegian Conservative Party have just announced their support for the so-called Data Retention Directive, an EU directive requiring states to conduct massive surveillance of their citizens by collecting and storing detailed metadata about their communications. As the larger Labour Party already supports the directive, they could now have secured a majority in the Norwegian parliament and be able to implement the directive.

The fact that democratic countries are now targeting such measures against their own citizens, when a few years ago it might have been unthinkable, underlines the fact that as society progresses, it's not necessarily towards freedom. Whether it was 9/11 or technological progress or simply cultural change, large-scale surveillance of individuals on behalf of society has now become an acceptable practice, and part of our daily lives.

I'm not a fan of society, because the very concept is so abstract and impersonal, detached from the people it supposedly represents. If we make up society, shouldn't what's in our interest be in its interest? I've mentioned before how I've read Atlas Shrugged by Ayn Rand. She wrote something else that's very clever: "The only happy society is one of happy individuals. One cannot have a healthy forest made up of rotten trees."

I'm voting for the Liberal People's Party in the coming elections, in an effort to strengthen the individual liberty that we all depend on to live our lives. If you're voting soon, I hope you're voting for something meaningful, too, as it's by following our convictions that we can make the votes really count.
post comment

洗手不干 [17 Dec 2010|11:37am]
Now, this Chromeo song has a very... clean sound:



For those who are learned in the Scandinavian languages, a relative of mine, Ivar Hanssen-Bauer, just released this cute song, about his soulmate-searching satellite dish:

post comment

北鄙之音 [07 Dec 2010|01:42am]
City FM城市之音, the radio station I enjoy both for its taste in music and for its tendency to surprise, played this lovely song today, which turned out to be sung by none other than the talented Lene Marlin. I can't wait to hear what they'll think of next.

4 comments|post comment

[18 Oct 2010|08:27pm]


:3
2 comments|post comment

D [09 Sep 2010|07:41am]
I decided to toy around with the programming language D today, because D is the best letter of the alphabet. D is intended to combine the performance, low-level functionality and much of the syntax of C and C++ with the kind of expressiveness seen in high-level languages.

Thus I took a script I've written in Python, which compares two .md5 files to see what changes have occurred in a set of files over a period of time, stripped out some of the comments and the exception handling, ported it to D and compiled it with the D compilers GDC and LDC to see what would happen. Here's my summary of the experience, along with the Python version of the script on the left, and the D version on the right.

Test-running the D program initially revealed it to be very slow, taking a full 7.75 times as long to run as the final version. Running it through a profiler showed the cause to be the garbage collector, which was slowing the program down due to the creation of a large amount of objects in memory. So in the D version I start out by turning the garbage collector off, and leave it off, since the program only performs one task before it quits.
#!/usr/bin/env python

import sys, re

if len(sys.argv) != 3:
	print "Usage: python %s oldchecksums newchecksums" % sys.argv[0]
	sys.exit(1)
#!/usr/bin/dmd -run

import std.stdio, std.file, std.string, std.regexp, std.c.stdio, std.gc;

int main(string[] argv) {

	std.gc.disable;

	if (argv.length != 3) {
		writefln("Usage: %s oldchecksums newchecksums", argv[0]);
		return 1;
	}

Splitting the file into individual lines presented another problem. Python does this easily with readlines(), while D's splitlines() wasn't quite working. Using strip and regexp instead solved the problem.
# Read the checksum files.

oldfile = open(sys.argv[1], "rb")
newfile = open(sys.argv[2], "rb")

oldlines = sorted(oldfile.readlines())
newlines = sorted(newfile.readlines())

oldfile.close()
newfile.close()
	// Read the checksum files.

	string oldfile = strip(cast(string)read(argv[1]));
	string newfile = strip(cast(string)read(argv[2]));

	string[] oldlines = std.regexp.split(oldfile,"\n").sort;
	string[] newlines = std.regexp.split(newfile,"\n").sort;

I never got round to finding out whether D's std.regexp.split() function is able to split a line into just two fields, as Python's re.split() can. Still, std.regexp.find(), along with some convenient array slicing code, did the job.
# Parse the data.

# Map paths to checksums and vice versa.
oldpaths = dict()
newpaths = dict()
oldsums = dict()
newsums = dict()

for line in oldlines:
	line = line.strip()
	line = re.split("\s+",line,1)
	checksum = line[0]
	path = line[1]
	oldpaths[path] = checksum
	oldsums[checksum] = path

for line in newlines:
	line = line.strip()
	line = re.split("\s+",line,1)
	checksum = line[0]
	path = line[1]
	newpaths[path] = checksum
	newsums[checksum] = path
	// Parse the data.

	// Map paths to checksums and vice versa.
	string[string] oldpaths;
	string[string] newpaths;
	string[string] oldsums;
	string[string] newsums;

	foreach(string line; oldlines) {
		line = strip(line);
		int space = std.regexp.find(line, r"\s");
		string checksum = strip(line[0..space]);
		string path = strip(line[space..$]);
		oldpaths[path] = checksum;
		oldsums[checksum] = path;
	}

	foreach(string line; newlines) {
		line = strip(line);
		int space = std.regexp.find(line, r"\s");
		string checksum = strip(line[0..space]);
		string path = strip(line[space..$]);
		newpaths[path] = checksum;
		newsums[checksum] = path;
	}

The Python version stores moved files as a tuple of the old path and the new one, whereas the D version merges the two, along with a little Unicode arrow, into one string.
# Correlate the data.

newfiles = list()
deletedfiles = list()
movedfiles = list()
modifiedfiles = list()

# Look for new files.
for path, checksum in newpaths.items():
	if (path not in oldpaths) and (checksum not in oldsums):
		newfiles.append(path)

# Look for deleted, modified or moved files.
for path, checksum in oldpaths.items():
	if (path not in newpaths) and (checksum not in newsums):
		deletedfiles.append(path)
	elif (path not in newpaths) and (checksum in newsums):
		newpath = newsums[checksum]
		movedfiles.append( (path, newpath) )
	elif (path in newpaths) and (checksum not in newsums):
		modifiedfiles.append(path)
	// Correlate the data.

	string[] newfiles;
	string[] deletedfiles;
	string[] movedfiles;
	string[] modifiedfiles;

	// Look for new files.
	foreach(string path, string checksum; newpaths) {
		if(!(path in oldpaths) && !(checksum in oldsums))
			newfiles ~= path;
	}

	// Look for deleted, modified or moved files.
	foreach(string path, string checksum; oldpaths) {
		if(!(path in newpaths) && !(checksum in newsums))
			deletedfiles ~= path;
		else if(!(path in newpaths) && (checksum in newsums)) {
			string newpath = newsums[checksum];
			movedfiles ~= path ~ " → " ~ newpath;
		}
		else if((path in newpaths) && !(checksum in newsums))
			modifiedfiles ~= path;
	}

This is where it gets strange, because some of my file names are invalid Unicode, and the printing function that's available in GDC and LDC insists upon validating each string. So instead of the usual call to writefln(), I call C's printf() with a special format specifier to print each name.

On the other hand, Python requires the use of a bizarre ".encode" call in order to show a little Unicode arrow between the old and new paths of moved files without choking, while D accepted it quite happily in the previous code block. So we can call that a possible tie.
# Present the data.

newfiles = sorted(newfiles)
deletedfiles = sorted(deletedfiles)
movedfiles = sorted(movedfiles)
modifiedfiles = sorted(modifiedfiles)

print

print "Modified files:\n"
for path in modifiedfiles:
	print path

print
print

print "Moved files:\n"
for path, newpath in movedfiles:
	print u"%s \u2192 %s".encode("UTF-8") % (path, newpath)

print
print

print "Deleted files:\n"
for path in deletedfiles:
	print path

print
print

print "New files:\n"
for path in newfiles:
	print path

print
	// Present the data.

	newfiles = newfiles.sort;
	deletedfiles = deletedfiles.sort;
	movedfiles = movedfiles.sort;
	modifiedfiles = modifiedfiles.sort;

	writefln;

	writefln("Modified files:\n");
	foreach(string path; modifiedfiles)
		printf("%.*s\n",path);

	writefln;
	writefln;

	writefln("Moved files:\n");
	foreach(string path; movedfiles)
		printf("%.*s\n",path);

	writefln;
	writefln;

	writefln("Deleted files:\n");
	foreach(string path; deletedfiles)
		printf("%.*s\n",path);

	writefln;
	writefln;

	writefln("New files:\n");
	foreach(string path; newfiles)
		printf("%.*s\n",path);

	writefln;

	return 0;
}

All in all, the D version totalled 2442 characters and 110 lines, while the Python version consisted of a modest 1952 characters over 100 lines. Speedwise, D was initially slower, with the Python version taking 1.74 seconds to go though the files in my home directory, and the D version taking a full 5.27 seconds, but with the garbage collector turned off, D took only 0.68 seconds.

Despite its remaining problems, D turned out to be fairly simple to use even when compared to Python, and would probably compare very favourably to Java, not to mention C++, in terms of programmer productivity. Whether the problems with the garbage collector will be solved remains to be seen. As they say, "more research is needed".
1 comment|post comment

[27 Aug 2010|11:48pm]
Music time again. I want to know who this man's choreographer is. ↓


post comment

navigation
[ viewing | most recent entries ]
[ go | earlier ]