“...I've been working with Ruby / Ruby on Rails since 2008, love a bit of Elixir / Phoenix, and am learning Rust. I also poke through other people's code and make PRs for open source Ruby projects that sometimes make it. Currently working for InPay...”

Rob Lacey (contact@robl.me)
Senior Software Engineer, Brighton, UK

Ruby fuzzy matching

I am running through a data set of free-text answers and trying to make the data consistent, as human input is always going to be wonky: typos, spaces instead of hyphens, complete misspellings. Found this…

https://github.com/seamusabshere/fuzzy_match

2.3.3 :002 > require 'fuzzy_match'
=> true
2.3.3 :003 > fm = FuzzyMatch.new(['Uwe Rosenberg', 'X-Com'])
=> #<FuzzyMatch:0x007ff02b05f370 @read=nil, @groupings=[], @identities=[], @stop_words=[], @default_options={:must_match_grouping=>false, :must_match_at_least_one_word=>false, :gather_last_result=>false, :find_all=>false, :find_all_with_score=>false, :threshold=>nil, :find_best=>false, :find_with_score=>false}, @haystack=[w("Uwe Rosenberg"), w("X-Com")]>
2.3.3 :004 > fm.find('Use Roenberb')
=> "Uwe Rosenberg"

Perfect. Time to de-dupe some dodgy data.
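
A rough sketch of what that de-dupe pass might look like: group every raw answer under the canonical value it matches most closely. The helper only needs an object that responds to find, so a FuzzyMatch instance should slot straight in. To keep the example runnable without the gem, it uses a hypothetical stand-in, NaiveMatcher, which picks the candidate with the smallest Levenshtein edit distance. That stand-in, the group_by_canonical helper, and the sample answers are all made up for illustration and are not how fuzzy_match scores things internally.

```ruby
# Hypothetical stand-in for FuzzyMatch so the sketch runs without the gem.
# It returns the candidate with the smallest Levenshtein edit distance,
# which is NOT the scoring fuzzy_match itself uses.
class NaiveMatcher
  def initialize(haystack)
    @haystack = haystack
  end

  # Always returns the nearest candidate, however far away it is.
  def find(needle)
    @haystack.min_by { |candidate| levenshtein(needle.downcase, candidate.downcase) }
  end

  private

  # Classic dynamic-programming edit distance, keeping a single row in memory.
  def levenshtein(a, b)
    row = (0..b.length).to_a
    a.each_char.with_index(1) do |char_a, i|
      previous_diagonal = row[0]
      row[0] = i
      b.each_char.with_index(1) do |char_b, j|
        cost = char_a == char_b ? 0 : 1
        current = [row[j] + 1, row[j - 1] + 1, previous_diagonal + cost].min
        previous_diagonal = row[j]
        row[j] = current
      end
    end
    row.last
  end
end

# Group the raw answers by the canonical value the matcher picks for each.
# `matcher` is anything that responds to #find(needle).
def group_by_canonical(raw_answers, matcher)
  raw_answers.map(&:strip).group_by { |answer| matcher.find(answer) }
end

canonical = ['Uwe Rosenberg', 'X-Com']
raw = ['Use Roenberb', 'uwe rosenberg ', 'X Com', 'XCom', 'X-Com']

groups = group_by_canonical(raw, NaiveMatcher.new(canonical))
groups.each { |name, variants| puts "#{name}: #{variants.inspect}" }
# Uwe Rosenberg: ["Use Roenberb", "uwe rosenberg"]
# X-Com: ["X Com", "XCom", "X-Com"]
```

With the gem installed, swapping in FuzzyMatch.new(canonical) for the stand-in should be the only change needed. One caveat with the naive version: it always returns something, so a real clean-up would probably want a cutoff for answers that are nothing like any canonical value.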