Tuesday, May 22, 2012

Point-free and arrows in Ruby

Extending point free functionality I came to a conclusion that using Array as a base for function fanout/composition was not really a good idea. From one side it's concise and convenient, but using "direction" doesn't look right, it can't be easily extended to more complex cases, messing with Array#to_proc would better be avoided, and last but not least instantiating Array object and calling to_proc each time leads to possible performance issue. Besides, there is a syntax which makes same thing almost as concise as an Array, and it's Proc#[], and Proc object doesn't need to be converted when passing as a block. So let's start with fanout/composition:
PComp = 
  lambda do |*xs|
    if xs.empty?
      PId
    else
      ph, *ps = xs.map(&:to_proc)
      lambda do |*ys|
        r = ph[*ys]
        ps.each do |p|
          r = p[r]
        end
        r
      end
    end
  end
  
PSplat =
  lambda do |*xs|
    ps = xs.map(&:to_proc)
    lambda { |*ys| ps.map { |p| p[*ys] } }
  end
So example in my previous post can be rewritten as:
Order.find(...) \
  .map(&PComp[:order_items, 
              :map.with_args(&PComp[:item, :item_group, PSplat[:id, :name]])]) \
  .inject([], &:<<).uniq
Slightly longer, but still looks good and action is obvious here so I don't need to count brackets. Now it also looks quite similar to Haskell arrows. PSplat resembles &&&, only Ruby isn't strong typed so I don't need to limit myself with a fixed number of elements (pair in Haskell). What about *** then? Well, let's call it PZip:
PZip =
  lambda do |*xs|
    ps = xs.map(&:to_proc)
    lambda do |ys| 
      ps.zip(ys).map do |p, e|
        p[e]
      end
    end
  end
How does this help? Let's say I need to find frequencies:
[1,2,2,3,3,3,4,5].group_by(&lambda { |x| x }).map{ |k, v| [k, v.length] }
First of all lambda { |x| x } is used pretty often in point-free style (Haskell's id), why not define it as:
PId = 
  lambda do |x| 
    x
  end
Now I can rewrite frequencies as:
[1,2,2,3,3,3,4,5].group_by(&PId).map(&PZip[PId, :length])
In fact in this particular example PZip acts like Haskell's second, but it's just because I chose this simple case. Now let's say I want to find all orders that match certain criteria:
Order.find(:all).select{ |order| order.customer.has_discount? && order.paid? && order.shipped? }
I can rewrite it as:
Order.find(:all).select(&PComp[PSplat[PComp[:customer, :has_discount?], :paid?, :shipped?], :all?])
or just
Order.find(:all).select(&PAll[PComp[:customer, :has_discount?], :paid?, :shipped?])
if I define additional lambdas (and it will be faster as they use lazy evaluation in all?(&block) and any?(&block)):
PAny =
  lambda do |*xs|
    ps = xs.map(&:to_proc)
    lambda { |*ys| ps.any? { |p| p[*ys] } }
  end

PAll =
  lambda do |*xs|
    ps = xs.map(&:to_proc)
    lambda { |*ys| ps.all? { |p| p[*ys] } }
  end

Friday, November 25, 2011

Symbol#to_proc and point free programming in Ruby

Let's say we have this schema:

class Order < ActiveRecord::Base
  has_many :order_items
end

class OrderItem < ActiveRecord::Base
  belongs_to :item
end

class Item < ActiveRecord::Base
  belongs_to :item_group
end

class ItemGroup < ActiveRecord::Base
end

And what I want to do is to find all item groups for certain orders. Straightforward approach:

Order.find(...).map { |order| 
  order.order_items.map { |order_item| 
    order_item.item.item_group 
  } 
}.flatten.uniq

Let's start from the inner part:

order_items.map { |order_item| order_item.item.item_group }

If I wanted to find items, then it would be much shorter:

order_items.map(&:item)

Could it be easier if I were able to write

order_items.map(&[:item, :item_group])

Certainly it could and all I need to do is to define Array#to_proc:

class Array
  def to_proc
    raise ArgumentError.new("empty collection") if empty?
    psh, *pst = map(&:to_proc)
    lambda do |*args| 
      pst.inject(psh[*args]) { |v, p| p[v] }
    end
  end
end

so my original statement can be rewritten as

Order.find(...).map{ |order| order.order_items.map(&[:item, :item_group] }.flatten.uniq

What about the outer part? To find all items I could use

Order.find(...).map(&[:order_items, :map?])

But how am I going to pass a block? There are two steps involved. First, I want to convert symbol to a lambda, which calls block behind the scene, and I also want my Array#to_proc to take Proc objects. Second step is trivial:

class Array
  def to_proc
    raise ArgumentError.new("empty collection") if empty?
    ps = 
      map do |e|
        case e 
        when Symbol
          e.to_proc
        when Proc
          e
        end
      end
    psh, *pst = ps
    lambda do |*args| 
      pst.inject(psh[*args]) { |v, p| p[v] }
    end
  end
end

And now I want to define Symbol#with_args which is going to call a method with arguments and block:

class Symbol
  def with_args(*args, &block)
    p = lambda { |e| e.__send__(self, *args, &block) }
    lambda { |e| p[e] }
  end
end

So original statement becomes even shorter

Order.find(...).map(&[:order_items, :map.with_args(&[:item, :item_group]).flatten.uniq

So I finally removed all "points", i. e. any mentionigs of block arguments.

Why this is important and what I have actually saved here? Well, Ruby has what I can call "loose" scoping, block arguments don't shadow outer scope, neither Ruby produce any warning about that. In deeply nested statements naming can become a serious issue. Of course, I could create small method for every step with it's own scope, but it doesn't help readability at all.

Now, Ruby has a nice deconstruction of block arguments

[[1,[2,3]], [4,[5,6]]].map { |x, (y, z)| z } # => [3,6]

and I want to prepare data for my view, and in fact I don't need whole item group model, I just need to keep name and id so I can write in my view

%ul
  - item_groups.each do |id, name|
    %li= link_to name, item_group_path(id)

my controller code then becomes

item_groups = Order.....map{ |item_group| [item_group.id, item_group.name] }

Now we have the same "point" again, we have to name item_group to convert it to a pair. If we could only write

...map(&[:id, :name])

but this place is already taken. Well, is it? What if I pass nested array and will change "direction" with each level (I will name previous case as "vertical", and this new way as "horizontal"):

class Array
  def to_proc(horizontal = nil)
    raise ArgumentError.new("empty collection") if empty?
    ps = 
      map do |e|
        case e 
        when Array
          e.to_proc(!horizontal)
        when Symbol
          e.to_proc
        when Proc
          e
        end
      end
    if horizontal
      lambda { |*args| ps.map { |p| p[*args] } }
    else
      psh, *pst = ps
      lambda do |*args| 
        pst.inject(psh[*args]) { |v, p| p[v] }
      end
    end
  end
end

Now we can write

...map(&[[:id, :name]])

Wait a second, now the whole statement can be written differently

Order.find(...).map(&[:order_items, 
                      :map.with_args(&[:item, 
                                       :item_group, 
                                       [:id, :name]])]).flatten(1).uniq

Meanwhile just remembering good old Symbol#to_proc I can rewrite it as

Order.find(...) \
  .map(&[:order_items, 
         :map.with_args(&[:item, :item_group, [:id, :name]])]) \
  .inject([], &:<<).uniq

or even

Order.find(...) \
  .map(&[:order_items, 
         :map.with_args(&[:item, :item_group, [:id, :name]])]) \
  .inject(Set.new, &:+).to_a

which brings us to another point -- can we actually join map and inject into one inject? There is a small problem here. Imagine we want to find sum of squares

[1,2,3].inject(0) { |a, x| a + x ** 2 }

can we write it in point-free style? Something like

[1,2,3].inject(0, &[:**.with_args(2), :+])

This obviously won't work as inject passes a and x, so what is actually getting to ** is a, not x, and then the result of ** is a number, not an array needed for sum. We need to somehow "slice" arguments and apply ** to a second one, but return both of them, and then "splat" an array and call block with more than one argument

[1,2,3].inject(0, &[slice_args(1, &:**.with_args(2)), :splat.with_args(&:+)])

We need to add two more methods:

def slice_args(index = 0, &block)
  lambda do |*args|
    args[index] = block[args[index]]
    args
  end
end

class Array
  def splat(&block)
    block[*self]
  end
end

So original example can be rewritten as

Order.find(...).inject(Set.new, 
                       &[slice_args(1, &[:order_items, 
                                         :map.with_args(&[:item, 
                                                          :item_group, 
                                                          [:id, :name]])]), 
                         :splat.with_args(&:+)]).to_a

There is a pattern here. I take an object and apply some transformation (e.g. an array) to it and get similar structure as a result. What if I use hash instead of an array? Obviously I should get hash back. To make this happen I define Hash#to_proc:

class Hash
  def kv_map(&block)
    block ||= k_lambda
    inject(self.class.new) do |kv, (k, v)|
      kv[k] = block[v]
      kv
    end
  end

  def to_proc
    ps = 
      kv_map do |e|
        case e
        when Array, Hash
          e.to_proc
        when Symbol
          e.to_proc
        when Proc
          e
        else
          lambda { e }
        end
      end
    lambda { |*args| ps.kv_map{ |p| p[*args] } }
  end
end

class Array
  def to_proc(horizontal = nil)
    raise ArgumentError.new("empty collection") if empty?
    ps = 
      map do |e|
        case e 
        when Array
          e.to_proc(!horizontal)
        when Hash
          e.to_proc
        when Symbol
          e.to_proc
        when Proc
          e
        else
          lambda { e }
        end
      end
    if horizontal
      lambda { |*args| ps.map { |p| p[*args] } }
    else
      psh, *pst = ps
      lambda do |*args| 
        pst.inject(psh[*args]) { |v, p| p[v] }
      end
    end
  end
end

Now I can write things like

K_LAMBDA = lambda{ |x| x }

[1,2,3].map(&{:i => K_LAMBDA, :i_sq => :**.with_args(2)}) 
#=> [{:i_sq=>1, :i=>1}, {:i_sq=>4, :i=>2}, {:i_sq=>9, :i=>3}]

All of this may seem a bit theoretical, but it does save some typing and make code more readable if used wisely.

Thursday, June 16, 2011

Ruby thread pool in Erlang style

require "thread"

POOL_SIZE = 5

items_to_process = (0..100).to_a

message_queue = Queue.new

start_thread = 
  lambda do
    Thread.new(items_to_process.shift) do |i|
      puts "Processing #{i}"
      message_queue.push(:done)
    end
  end

items_left = items_to_process.length

[items_left, POOL_SIZE].min.times do
  start_thread[]
end

while items_left > 0 
  message_queue.pop
  items_left -= 1
  start_thread[] unless items_left < POOL_SIZE
end

Monday, May 9, 2011

JS-like containers in Rails

Quite often I need to pass complex data structure to a view. Obvious solution would be a hash, the drawback is that too much typing involved. Impressed by Javascript syntax I decided to create similar object in Ruby, and it was easy:

require "set"

class JContainer
  include Enumerable
  
  def initialize(props = {})
    @props = Set.new
    set_props(props)
  end
  
  def [](k)
    instance_variable_get("@#{k}")
  end
  
  def []=(k, v)
    k_sym = k.to_sym
    unless @props.include?(k_sym)
      @props.add(k_sym)
      (
        class << self
          self
        end
      ).instance_eval do 
        attr_accessor k_sym
      end
    end
    instance_variable_set("@#{k}", v)
    v
  end

  def ==(j_container)
    self.class === j_container && props == j_container.props
  end
  
  def set_props(props)
    props.each{ |k, v| self[k] = v }
    self
  end
  
  def props
    Hash[*(@props.map{ |p| [p, self[p]] }.flatten(1))]
  end

  def to_json
    props.to_json
  end
  
  def each
    props.each{ |k, v| yield [k, v]}
  end

  private
  
  def method_missing(k, v = nil)
    k_s = k.to_s
    if k_s.chomp!("=")
      self[k_s] = v
    else
      self[k]
    end
  end
end
So now I can do things like:
js = JContainer.new(:x => 0, :y => 1, :z => 2)
js.a = [1,2,3]
js["b"] = 2
js[:a]
js[:c]
[js.x, js.y]
js.props
JContainer.new(:x => 0, :y => 1, :z => 2) == JContainer.new(:x => 0, :y => 1, :z => 2)
JContainer.new(:x => 0, :y => 1, :z => 2) != JContainer.new(:x => 0, :y => 1, :z => 3)
JContainer.new(:x => 0, :y => 1, :z => 2).map{|k, v| [k, v] }
(0..9).map{|i| JContainer.new(:x => i)}.map(&:x)

Limited argument matching in Ruby

class Object
  def closure(&block)
    block[self]
  end
end

[1,2].closure{ |x, y| x + y } # => 3 
[[1, 2], 3].closure{ |(x, y), z| z + y } # => 5

In fact, closure can also be used for creating "limited namespace" when chaining function calls:

[1,2,3].map{ |e| e * 2 }.closure{ |a| [a[1], a[0] + a[2]] } # => [4, 8]

Sunday, April 24, 2011

Ruby generators

Another exciting way to implement lazy evaluation using generator:

require "generator"

fib =
  Generator.new do |g|
    a, b = 0, 1
    loop do
      g.yield a
      a, b = b, a + b
    end
  end

puts (0..19).map{ fib.next }.inspect

My personal taste though would be:

fib =
  lambda{
    a, b = 0, 1
    lambda{ begin a ensure a, b = b, a + b end }
  }[]

puts (0..19).map{ fib[] }.inspect

I don't quite like the idea of longjmp (callcc in Ruby).

By the way, Ruby 1.9 introduces fibers, which work very similar way to generator.

Monday, January 31, 2011

Erlang make and return code

erl -make doesn't return error code. Simple solution makes it possible:

#!/usr/bin/env escript
%%! -smp enable

main(_) ->
  case make:all() of
    error ->
      halt(1);
    _ ->
      ok
  end.