Open Source ETL framework


I was asked to prototype two ETL frameworks. The requirements are as follows:

  • Open Source
  • Available for Linux
  • Maintained
  • Logs can be viewed in a web browser (nice to have)
  • Written in Perl, Python, Ruby or Java

The raw file can be anything (Excel, CSV, an HTML page, etc.). The target database is MySQL.

Don't just drop names; please indicate the advantages and disadvantages based on your experience.


1/18/2012 3:45:33 AM

Accepted Answer

I've used Kettle. It has its own GUI, but if you would rather use the API to do the ETL yourself, that is also supported. It has proved very useful to me, and there are a few plugins already available for it.

1/18/2012 3:55:06 AM

One of the most popular Java-based ETL tools is Talend.

Jaspersoft ETL is another option; it is built on Talend and has a nice Eclipse-based UI.

