I'm building a website with Python/Django. Users submit tags. Each tag can contain multiple words. Each tag has an ID number. I want to make sure tags that are formatted slightly differently are still being recognized as the same tag.
For example, if one user submitted the tag "electric guitar" and the other submitted "electric  guitar" (2 white spaces between the 2 words), I want to be able to recognize they are the same tag.
How do I remove all the extra white spaces and tabs in this case? Thanks.
Split on any whitespace, then join on a single space.
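A minimal sketch of that split-and-join approach, in case it helps. The function name normalize_tag is just a placeholder for illustration, not anything from Django:

```python
def normalize_tag(tag: str) -> str:
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and ignores leading/trailing whitespace.
    words = tag.split()
    # Re-join the words with exactly one space between them.
    return " ".join(words)


print(normalize_tag("electric  guitar"))
print(normalize_tag("\telectric \t guitar  "))
```

Both calls should give the same string, so comparing the normalized values (or storing only the normalized form) would treat the two submissions as one tag. Note that this also strips whitespace at the start and end of the tag.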
Alternatively, using a regular expression:
>>> import re
>>> re.sub(r'\s+', ' ', 'some   test with    ugly  whitespace')
'some test with ugly whitespace'