Python/Django: How to remove extra white spaces & tabs from a string?


I'm building a website with Python/Django. Users submit tags. Each tag can contain multiple words, and each tag has an ID number. I want tags that are formatted slightly differently to still be recognized as the same tag.

For example, if one user submitted the tag "electric guitar" and the other submitted "electric   guitar" (2 white spaces between the 2 words) I want to be able to recognize they are the same tag.

How do I remove all the extra white spaces and tabs in this case? Thanks.

11/22/2010 2:30:59 AM

Accepted Answer

Split on any whitespace, then join on a single space.

' '.join(s.split())
11/22/2010 2:33:03 AM
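
To make the split-and-join approach concrete for the tag use case, here is a minimal sketch. The helper name normalize_tag is made up for illustration and is not part of Django or the original answer. It relies on str.split() with no arguments, which splits on any run of whitespace (spaces, tabs, newlines) and ignores leading and trailing whitespace.

```python
def normalize_tag(tag):
    """Collapse each run of whitespace to one space and trim both ends.

    Hypothetical helper name, used here only for illustration.
    """
    # split() with no arguments splits on any whitespace run and
    # discards empty strings at the start and end.
    return ' '.join(tag.split())


# Differently formatted submissions normalize to the same string.
print(normalize_tag("electric   guitar"))       # electric guitar
print(normalize_tag("\telectric\t guitar  "))   # electric guitar
```

Comparing the normalized strings, rather than the raw input, is one way to decide whether two submissions refer to the same tag.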

>>> import re
>>> re.sub(r'\s+', ' ', 'some   test with     ugly  whitespace')
'some test with ugly whitespace'
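
One difference worth noting: the regex replaces each whitespace run with a single space, so whitespace at the start or end of the string is reduced to one space rather than removed. A sketch that adds .strip() to handle that, assuming trimmed tags are what you want (the function name collapse_whitespace is hypothetical):

```python
import re

# Precompiled pattern matching one or more whitespace characters.
_WHITESPACE_RUN = re.compile(r'\s+')


def collapse_whitespace(text):
    """Replace each whitespace run with one space, then trim the ends.

    Hypothetical helper name, used here only for illustration.
    """
    return _WHITESPACE_RUN.sub(' ', text).strip()


raw = '  electric \t guitar '
print(repr(re.sub(r'\s+', ' ', raw)))    # ' electric guitar '
print(repr(collapse_whitespace(raw)))    # 'electric guitar'
```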

Licensed under: CC-BY-SA with attribution