Ruby: lazy chunked hash-like behavior

When we want to iterate over a long list of records, we can simply write a query and get a cursor; ActiveRecord will do all the heavy lifting for us.
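
For that simple case, something like find_each already does the batching for us. A quick sketch (the Measurement model, its recorded_at column and the handle call are hypothetical, just to have something concrete):

    # Simple case: let ActiveRecord batch the rows for us.
    # "Measurement" and "handle" are hypothetical, used only for illustration.
    Measurement.where(recorded_at: 3.months.ago..Time.current)
               .find_each(batch_size: 1000) do |measurement|
      handle(measurement) # whatever per-row work we need
    end
    # Note: find_each walks the table in primary-key order and ignores custom
    # ordering, which is one reason it doesn't fit the time-ordered case below.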

But what happens when we need to run some complicated computations on a set of data that can be too big to keep in memory for the entire process?

This is when we need to get a bit more creative.

I'd like to share what I came up with.

The problem:

- Complex calculations on a time-based data series spanning a period of 3 months.
- Each calculation may depend on the previous one, and on both past and future data.
- The calculations must run in order.
- Fetching all the data at once crashes the server with an out-of-memory error.

The solution:

I wanted to make the smallest code change possible, and the data was already being accessed via a hash.
So I decided to wrap the hash in something I call a lazy chunked hash (I tried googling it; something similar seems to be standard behavior in Clojure).

It looks like this:

class ValuesProvider
  def initialize
    @loaded_date = nil
    @hash = Hash.new(0)
  end

  def [](time_slot)
    get(time_slot)
  end

  private

  def get(time_slot)
    relevant_date = time_slot.to_date
    load(relevant_date) unless relevant_date == @loaded_date
    @hash[time_slot.to_i]
  end
end
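
The load method itself isn't shown above. Here is a minimal sketch of how it might look, assuming the same hypothetical Measurement model with recorded_at and value columns (those names are mine, not from the original code). The important part is that it swaps in a single day's worth of data and remembers which date is currently loaded:

  # Hypothetical private method, not shown in the original class.
  # Replaces the hash with one day's worth of data and records which
  # date is in memory, so the next lookup on the same day is a plain read.
  def load(date)
    @hash = Hash.new(0) # keep the default value of 0 for missing slots
    Measurement.where(recorded_at: date.beginning_of_day..date.end_of_day)
               .pluck(:recorded_at, :value)
               .each { |recorded_at, value| @hash[recorded_at.to_i] = value }
    @loaded_date = date
  end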

Pretty simple, and it does the job: instead of loading all the data at once, the data is loaded one day at a time. This way we keep it chunky, but not too chunky.

And the best part: the code that consumes the data didn't have to change at all, because the [] method makes ValuesProvider behave just like the hash it replaced.
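
A hypothetical consumer sketch, just to show the shape of it (the 15-minute slot size and the dates are made up, and the time helpers assume a Rails environment):

    provider   = ValuesProvider.new
    slot       = Time.zone.parse("2016-01-01 00:00")
    end_of_run = slot + 3.months

    while slot < end_of_run
      value = provider[slot] # looks like a plain hash lookup; reloads only when the day changes
      # ... run the order-dependent calculation with value ...
      slot += 15.minutes
    end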

This solution works well when the consumer's data requests (the calls to []) imply which data should be loaded, which is true most of the time, but not always.
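
To make that caveat concrete, here is a made-up access pattern where the requests don't line up with the chunking: jumping between distant dates forces a reload on almost every lookup, so the lazy-per-day approach stops paying off.

    # Hypothetical worst case: random access across dates defeats the per-day cache.
    provider = ValuesProvider.new
    jan = Time.zone.parse("2016-01-05 10:00")
    mar = Time.zone.parse("2016-03-20 10:00")

    provider[jan] # loads January 5th
    provider[mar] # throws January away, loads March 20th
    provider[jan] # loads January 5th all over again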
