KNIME: Joins are the key to using more than one data set

The ability to join data tables together is a foundation of any analytics platform, KNIME included.  You need some way not only to join two or more tables together, but also to ensure the records are matched based on some key, perhaps a specific item or customer.
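To make the idea of matching on a key concrete, here is a minimal sketch in plain Python. It is only an illustration of the concept, not how KNIME does it internally. The tables, the column names, and the `join_on_key` helper are all made up for this example.

```python
# Hypothetical sample data: neither table comes from a real workflow.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 2, "amount": 40.0},
    {"order_id": 102, "customer_id": 1, "amount": 15.5},
]


def join_on_key(left, right, key):
    """Pair each left row with every right row that has the same key value."""
    # Index the right table by key so each lookup is a single dict access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        # Rows without a match in the right table are simply skipped.
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


orders_with_names = join_on_key(orders, customers, "customer_id")
for row in orders_with_names:
    print(row["order_id"], row["name"], row["amount"])
```

Each order ends up with the customer name attached, because the two tables share the `customer_id` column.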

This is where KNIME can come in handy.  Load two data sources, then connect them via a join node.  It really is that simple.  If you need to join another data source…well, then just use another join node.

Excel, on the other hand, has serious issues with joins.  The two common ways to do a join in Excel are VLOOKUP and INDEX-MATCH.  However, both are error prone, hard to read or re-create when someone else wrote them, and extremely slow with large amounts of data.  20,000 records can take forever to run, and 200,000 or more may crash your computer entirely.

The join node itself is also simple to use.  It has drag and drop menus with options that give you all of the different types of joins: right joins, left joins, inner joins, outer joins…
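If the join types are new to you, the rough sketch below shows how they differ in which rows they keep. It is plain Python written for illustration, not KNIME's implementation. The `join` function, its `how` parameter, and the two small tables are assumptions I made up for the example.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right, or outer."""
    if how not in {"inner", "left", "right", "outer"}:
        raise ValueError(f"unknown join type: {how}")

    # Collect column names so unmatched rows can be padded with None.
    left_cols = {col for row in left for col in row}
    right_cols = {col for row in right for col in row}

    # Index the right table by key value.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    result = []
    matched_keys = set()
    for lrow in left:
        matches = index.get(lrow[key], [])
        if matches:
            matched_keys.add(lrow[key])
            for rrow in matches:
                result.append({**lrow, **rrow})
        elif how in ("left", "outer"):
            # Keep the unmatched left row; right-only columns become None.
            result.append({**{col: None for col in right_cols}, **lrow})

    if how in ("right", "outer"):
        for rrow in right:
            if rrow[key] not in matched_keys:
                # Keep the unmatched right row; left-only columns become None.
                result.append({**{col: None for col in left_cols}, **rrow})
    return result


# Hypothetical sample tables: item B is in both, A and C are in only one.
prices = [{"item": "A", "price": 1.0}, {"item": "B", "price": 2.0}]
stock = [{"item": "B", "qty": 5}, {"item": "C", "qty": 7}]

inner_rows = join(prices, stock, "item", how="inner")
left_rows = join(prices, stock, "item", how="left")
right_rows = join(prices, stock, "item", how="right")
outer_rows = join(prices, stock, "item", how="outer")

for name, rows in [("inner", inner_rows), ("left", left_rows),
                   ("right", right_rows), ("outer", outer_rows)]:
    print(name, len(rows))
```

In short: an inner join keeps only the items found in both tables, a left or right join keeps everything from one side, and an outer join keeps everything from both.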

If you are not sure what all of those are, let me know and maybe I will make a video on the various types of joins and what you might use them for.