Any reason not to use '+' to concatenate two strings?


Question

A common antipattern in Python is to concatenate a sequence of strings using + in a loop. This is bad because the Python interpreter has to create a new string object for each iteration, and it ends up taking quadratic time. (Recent versions of CPython can apparently optimize this in some cases, but other implementations can't, so programmers are discouraged from relying on this.) ''.join is the right way to do this.

However, I've heard it said (including here on Stack Overflow) that you should never, ever use + for string concatenation, but instead always use ''.join or a format string. I don't understand why this is the case if you're only concatenating two strings. If my understanding is correct, it shouldn't take quadratic time, and I think a + b is cleaner and more readable than either ''.join((a, b)) or '%s%s' % (a, b).

Is it good practice to use + to concatenate two strings? Or is there a problem I'm not aware of?

1
116
5/23/2017 12:10:05 PM

Accepted Answer

There is nothing wrong in concatenating two strings with +. Indeed it's easier to read than ''.join([a, b]).

You are right though that concatenating more than 2 strings with + is an O(n^2) operation (compared to O(n) for join) and thus becomes inefficient. However this has not to do with using a loop. Even a + b + c + ... is O(n^2), the reason being that each concatenation produces a new string.

CPython2.4 and above try to mitigate that, but it's still advisable to use join when concatenating more than 2 strings.

108
1/10/2016 12:44:28 PM

Plus operator is perfectly fine solution to concatenate two Python strings. But if you keep adding more than two strings (n > 25) , you might want to think something else.

''.join([a, b, c]) trick is a performance optimization.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon