I've started on a project graphing Tomcat logs using gnuplot-py, specifically correlating particular requests with memory allocation and garbage collection. What is the collective wisdom on gnuplot-py vs Matplotlib for Python graphing? Are there better graphing libraries out there that I haven't heard of?
My general considerations are:
How would you approach this task?
I know this post is old and already answered, but I was passing by and wanted to put in my two cents. Here is my conclusion: if your data set is not very large, you should use Matplotlib. It's easier and looks better. However, if you really need performance, you could use Gnuplot. I've added some code so you can test it on your machine and see for yourself whether it makes a real difference (this is not a rigorous benchmark, but it should give a first idea).
The following graph represents the time (in seconds) required to plot a random scatter graph and save it to a PNG file.
I remember the performance gap being much wider when running on an older computer with older versions of the libraries (~30 seconds difference for a large scatter plot).
Moreover, as mentioned in the comments, you can get plots of equivalent quality with Gnuplot, but it takes more effort to get there.
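To give a sense of the Matplotlib side of that trade-off for the kind of plot the question describes (requests against memory use and garbage collection), here is a minimal sketch. The function name `plot_requests_vs_heap`, the output file name, and all the sample values are made up for illustration; real data would come from your parsed logs.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt


def plot_requests_vs_heap(seconds, request_ms, heap_mb, gc_seconds):
    """Plot request times and heap usage against a shared time axis."""
    fig, ax_req = plt.subplots(figsize=(8, 4))
    ax_req.scatter(seconds, request_ms, s=12, color="tab:blue")
    ax_req.set_xlabel("Time since start (s)")
    ax_req.set_ylabel("Request time (ms)")

    # Second y-axis that shares the x-axis, used for the heap series.
    ax_heap = ax_req.twinx()
    ax_heap.plot(seconds, heap_mb, color="tab:red")
    ax_heap.set_ylabel("Heap used (MB)")

    # Mark each garbage collection with a dashed vertical line.
    for t in gc_seconds:
        ax_req.axvline(t, color="gray", linestyle="--", linewidth=0.8)

    fig.tight_layout()
    return fig


# Made-up illustrative values, not measurements.
seconds = [0, 1, 2, 3, 4, 5]
request_ms = [12, 15, 40, 14, 13, 38]
heap_mb = [100, 180, 260, 90, 170, 250]
gc_seconds = [2.5]

fig = plot_requests_vs_heap(seconds, request_ms, heap_mb, gc_seconds)
fig.savefig("requests_vs_heap.png")
```

`twinx()` gives the heap series its own y-axis while sharing the time axis, which keeps two differently scaled quantities readable on one figure.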
Here's the code that generates the timing graph, if you want to give it a try on your machine:
```python
# -*- coding: utf-8 -*-
from timeit import default_timer as timer
import matplotlib.pyplot as plt
import Gnuplot, Gnuplot.funcutils
import numpy as np
import sys
import os


def mPlotAndSave(x, y):
    """Scatter-plot with Matplotlib, save to PNG, then clear the figure."""
    plt.scatter(x, y)
    plt.savefig('mtmp.png')
    plt.clf()


def gPlotAndSave(data, g):
    """Plot with Gnuplot into a PNG file, then clear the plot."""
    g("set output 'gtmp.png'")
    g.plot(data)
    g("clear")


def cleanup():
    """Remove the temporary image files, ignoring missing ones."""
    try:
        os.remove('gtmp.png')
    except OSError:
        pass
    try:
        os.remove('mtmp.png')
    except OSError:
        pass


begin = 2
end = 500000
step = 10000
numberOfPoints = range(begin, end, step)
n = len(numberOfPoints)
gnuplotTime = []
matplotlibTime = []
progressBarWidth = 30

# Upper bound for the random data (sys.maxint only exists in Python 2)
maxValue = np.iinfo(np.int32).max

# Init Gnuplot
g = Gnuplot.Gnuplot()
g("set terminal png size 640,480")

# Init matplotlib to avoid a peak in the beginning
plt.clf()

for idx, val in enumerate(numberOfPoints):
    # Print a nice progress bar (crucial)
    sys.stdout.write('\r')
    progress = (idx + 1) * progressBarWidth // n
    bar = ("▕" + "▇" * progress + "▁" * (progressBarWidth - progress) + "▏"
           + str(idx) + "/" + str(n - 1))
    sys.stdout.write(bar)
    sys.stdout.flush()

    # Generate random data
    x = np.random.randint(maxValue, size=val)
    y = np.random.randint(maxValue, size=val)
    gdata = list(zip(x, y))

    # Time the Matplotlib plot-and-save
    start = timer()
    mPlotAndSave(x, y)
    stop = timer()
    matplotlibTime.append(stop - start)

    # Time the Gnuplot plot-and-save
    start = timer()
    gPlotAndSave(gdata, g)
    stop = timer()
    gnuplotTime.append(stop - start)

    # Clean up the files
    cleanup()

del g
sys.stdout.write('\n')

# Plot both timing series
plt.plot(numberOfPoints, gnuplotTime, label="gnuplot")
plt.plot(numberOfPoints, matplotlibTime, label="matplotlib")
plt.legend(loc='upper right')
plt.xlabel('Number of points in the scatter graph')
plt.ylabel('Execution time (s)')
plt.savefig('execution.png')
plt.show()
```
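The script above times each plot only once per data size, so individual points can be noisy. If you want steadier numbers, one option is to repeat each measurement and keep the fastest run. This is only a sketch: `best_time` and `busy_work` are hypothetical names, and `busy_work` is a stand-in for a plot-and-save call.

```python
import timeit


def best_time(func, repeats=5):
    """Return the fastest of `repeats` single runs of func(), in seconds."""
    # timeit.repeat returns one elapsed time per repetition; number=1
    # means each repetition calls func exactly once.
    return min(timeit.repeat(func, repeat=repeats, number=1))


def busy_work():
    # Stand-in workload; swap in a plot-and-save call here.
    return sum(i * i for i in range(100000))


print("best of 5 runs: %.4f s" % best_time(busy_work))
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to be running during a measurement.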