Daily Code Reading #15 – Flay#process_sexp

Now I’m starting to get into the deep dark corners of flay. The #process_sexp method is the next step in the process.

The Code

  def process_sexp pt
    pt.deep_each do |node|
      next unless node.any? { |sub| Sexp === sub }
      next if node.mass < self.mass_threshold
 
      self.hashes[node.structural_hash] << node
    end
  end

Review

#process_sexp is running a collection routine to store the parts of each s-expression into the shared hashes data structure. It starts by recursively applying the block to each s-expression using #deep_each. This should make sure that each section of code is looked at for duplication:

  • the class as a whole
  • the methods in the class
  • the code inside a method
  • the single line of code
  • the sub-expressions nested inside each Ruby statement

Each of those gets yielded to the block in #process_sexp which does two checks before adding it to the hashes.
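To get a feel for that traversal, here is a minimal sketch of a depth-first walk. It does not use flay or its Sexp class: plain nested Arrays stand in for s-expressions, and deep_each_node is a hypothetical helper I made up for illustration. I'm assuming the real #deep_each visits nested s-expressions in roughly this order.

```ruby
# Hypothetical stand-in: nested Arrays play the role of Sexp nodes.
# Yields every nested array depth-first; bare atoms are never yielded.
def deep_each_node(tree, &block)
  tree.each do |child|
    next unless child.is_a?(Array)
    block.call(child)
    deep_each_node(child, &block)
  end
end

# Roughly the shape a parser might produce for: def add(a, b); a + b; end
tree = [:defn, :add, [:args, :a, :b],
        [:call, [:lvar, :a], :+, [:lvar, :b]]]

visited = []
deep_each_node(tree) { |node| visited << node.first }

p visited # [:args, :call, :lvar, :lvar]
```

Every nested node gets its turn in the block, from the larger pieces down to the smallest ones.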

next unless node.any? { |sub| Sexp === sub }

This is checking if the current node (s-expression) contains at least one other s-expression. Nodes made up only of atoms, like a single literal, are skipped.
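A small sketch of how that check behaves, again using plain Arrays as a stand-in for Sexp (so Array === sub takes the place of Sexp === sub). The has_sub_expression? name is hypothetical.

```ruby
# Hypothetical stand-in: Arrays play the role of Sexp nodes.
def has_sub_expression?(node)
  node.any? { |sub| Array === sub }
end

leaf   = [:lit, 42]                         # only atoms inside
branch = [:call, [:lit, 1], :+, [:lit, 2]]  # atoms mixed with nested nodes

p has_sub_expression?(leaf)   # false, so the block would skip it
p has_sub_expression?(branch) # true, so the block would carry on
```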

next if node.mass < self.mass_threshold

This check makes sure that only nodes at or above the mass threshold are included in the report. Flay defaults the threshold to 16 but includes an option to change it.
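I haven't dug into how mass is calculated, so treat the following as a sketch under an assumption: that a node's mass is 1 for itself plus the mass of every nested node, with atoms counting for nothing. Arrays stand in for Sexp again, and the mass helper here is my own, not the library's.

```ruby
# Hypothetical stand-in: Arrays play the role of Sexp nodes.
# Assumed rule: a node weighs 1 plus the mass of each nested node.
def mass(node)
  node.inject(1) { |total, sub| sub.is_a?(Array) ? total + mass(sub) : total }
end

mass_threshold = 16 # the default mentioned above

small = [:call, [:lvar, :a], :+, [:lvar, :b]]
big   = [:block] + Array.new(15) { [:lit, 1] }

p mass(small)                  # 3
p mass(big)                    # 16
p mass(small) < mass_threshold # true, so this node would be skipped
p mass(big) < mass_threshold   # false, so this node would be kept
```

If that assumption holds, the threshold is really a minimum size: tiny fragments that would match all over the place never make it into the hashes.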

self.hashes[node.structural_hash] << node

Finally, if both of the checks above pass, the node is added to the shared hashes structure. From what I can see, node.structural_hash is a simplified version of the structure that just holds the s-expression types. I think this is what lets flay track how many times a statement was used: similar statements are simplified down to the same structural hash, so they end up stored under the same key.
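Here is a rough sketch of that grouping idea. The structure and structural_hash helpers below are hypothetical versions written for plain Arrays, based only on my reading above: keep the node types, drop the other atoms, and hash what is left. I'm also assuming hashes is a Hash that defaults each missing key to an empty array, since the method appends with << without setting anything up first.

```ruby
# Hypothetical stand-in: Arrays play the role of Sexp nodes.
# Keep the node type (first element) and nested nodes; drop other atoms.
def structure(node)
  nested = node.drop(1).select { |sub| sub.is_a?(Array) }
  [node.first] + nested.map { |sub| structure(sub) }
end

def structural_hash(node)
  structure(node).hash
end

# Assumed shape of the shared structure: each new key starts as [].
hashes = Hash.new { |h, k| h[k] = [] }

a = [:call, [:lvar, :x], :+, [:lit, 1]]  # x + 1
b = [:call, [:lvar, :y], :-, [:lit, 2]]  # y - 2, same shape as a
c = [:call, [:lit, 1], :+, [:lit, 1]]    # 1 + 1, different shape

[a, b, c].each { |node| hashes[structural_hash(node)] << node }

p structure(a)                       # [:call, [:lvar], [:lit]]
p hashes[structural_hash(a)].length  # 2
```

In this sketch a and b differ in variable names, operator, and literal, but share a shape, so they are collected under one key. Any key holding more than one node is a candidate for duplication.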

If anyone has anything they want to add, please post a comment below. Parsing and s-expressions are not my strengths so I might be totally reading this wrong.