- str.capitalize() -> str
- str.casefold() -> str [only for Python >= 3.3]
- str.center(width[, fillchar]) -> str
- str.count(sub[, start[, end]]) -> int
- str.decode([encoding[, errors]]) -> unicode [only in Python 2.x]
- str.encode(encoding="utf-8", errors="strict") -> bytes
- str.endswith(suffix[, start[, end]]) -> bool
- str.expandtabs(tabsize=8) -> str
- str.find(sub[, start[, end]]) -> int
- str.format(*args, **kwargs) -> str
- str.format_map(mapping) -> str
- str.index(sub[, start[, end]]) -> int
- str.isalnum() -> bool
- str.isalpha() -> bool
- str.isdecimal() -> bool
- str.isdigit() -> bool
- str.isidentifier() -> bool
- str.islower() -> bool
- str.isnumeric() -> bool
- str.isprintable() -> bool
- str.isspace() -> bool
- str.istitle() -> bool
- str.isupper() -> bool
- str.join(iterable) -> str
- str.ljust(width[, fillchar]) -> str
- str.lower() -> str
- str.lstrip([chars]) -> str
- static str.maketrans(x[, y[, z]])
- str.partition(sep) -> (head, sep, tail)
- str.replace(old, new[, count]) -> str
- str.rfind(sub[, start[, end]]) -> int
- str.rindex(sub[, start[, end]]) -> int
- str.rjust(width[, fillchar]) -> str
- str.rpartition(sep) -> (head, sep, tail)
- str.rsplit(sep=None, maxsplit=-1) -> list of strings
- str.rstrip([chars]) -> str
- str.split(sep=None, maxsplit=-1) -> list of strings
- str.splitlines([keepends]) -> list of strings
- str.startswith(prefix[, start[, end]]) -> bool
- str.strip([chars]) -> str
- str.swapcase() -> str
- str.title() -> str
- str.translate(table) -> str
- str.upper() -> str
- str.zfill(width) -> str
String objects are immutable, meaning that they can't be modified in place the way a list can. Because of this, methods on the built-in type str always return a new str object, which contains the result of the method call.
Case insensitive string comparisons
Comparing strings in a case insensitive way seems like something that's trivial, but it's not. This section only considers unicode strings (the default in Python 3). Note that Python 2 may have subtle weaknesses relative to Python 3: the latter's unicode handling is much more complete.
The first thing to note is that case-removing conversions in unicode aren't trivial. There is text for which text.lower() != text.upper().lower(), such as "ß": "ß".lower() is "ß", but "ß".upper().lower() is "ss".
But let's say you wanted to caselessly compare "BUSSE" and "Buße". Heck, you probably also want "BUSSE" and "BUẞE" to compare equal; that's the newer capital form. The recommended way is to use casefold.
Do not just use lower. If casefold is not available, doing .upper().lower() helps (but only somewhat).
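Then you should consider accents. If your font renderer is good, you probably think "ê" == "ê" is true, but it is not.
This is because they are actually different sequences of code points: one is the single precomposed character U+00EA (LATIN SMALL LETTER E WITH CIRCUMFLEX), and the other is "e" followed by U+0302 (COMBINING CIRCUMFLEX ACCENT).
The simplest way to deal with this is unicodedata.normalize. You probably want to use NFKD normalization, but feel free to check the documentation. Then, for two strings a and b, one compares unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b).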
To finish up, this can be expressed in functions: one that normalizes a string and one that compares two normalized strings.
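The steps above (casefold, then normalize) can be sketched as a pair of small helpers. The function names normalize_caseless and caseless_equal are illustrative choices, and NFKD is only one reasonable normalization form; treat this as a minimal sketch rather than a complete caseless-matching implementation.

```python
import unicodedata


def normalize_caseless(text):
    """Return a form of text intended for caseless, accent-aware comparison."""
    # Casefold to remove case distinctions, then decompose with NFKD so that
    # precomposed and combining forms of the same character compare equal.
    return unicodedata.normalize("NFKD", text.casefold())


def caseless_equal(left, right):
    """Compare two strings, ignoring case and composition differences."""
    return normalize_caseless(left) == normalize_caseless(right)


print(caseless_equal("BUSSE", "Bu\u00dfe"))   # True
print(caseless_equal("\u00ea", "e\u0302"))    # True
```

The strings are written with \u escapes here only so that the two forms of "ê" stay visibly distinct in the source.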
Changing the capitalization of a string
Python's string type provides many methods that act on the capitalization of a string. These include str.casefold, str.upper, str.lower, str.capitalize, str.title and str.swapcase.
With unicode strings (the default in Python 3), these operations are not 1:1 mappings or reversible. Most of these operations are intended for display purposes, rather than normalization.
str.casefold creates a lowercase string that is suitable for case insensitive comparisons. This is more aggressive than str.lower and may modify strings that are already in lowercase or cause strings to grow in length, and is not intended for display purposes.
The transformations that take place under casefolding are defined by the Unicode Consortium in the CaseFolding.txt file on their website.
str.upper takes every character in a string and converts it to its uppercase equivalent.
str.lower does the opposite; it takes every character in a string and converts it to its lowercase equivalent.
str.capitalize returns a capitalized version of the string, that is, it makes the first character upper case and the rest lower case.
str.title returns the title cased version of the string, that is, every letter at the beginning of a word is made upper case and all others are made lower case.
str.swapcase returns a new string object in which all lower case characters are swapped to upper case and all upper case characters to lower case.
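A short illustration of the capitalization methods described above; the sample strings are arbitrary and chosen only for demonstration.

```python
text = "this is a 'string'."

print(text.upper())           # THIS IS A 'STRING'.
print(text.capitalize())      # This is a 'string'.
print(text.title())           # This Is A 'String'.
print("XYZ abc".lower())      # xyz abc
print("AbC dEf".swapcase())   # aBc DeF

# casefold is more aggressive than lower: it also maps "ß" to "ss".
sharp = "Stra\u00dfe"
print(sharp.lower() == "stra\u00dfe")   # True
print(sharp.casefold())                 # strasse
```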
str class methods
It is worth noting that these methods may be called either on string objects (as described above) or through the str class (with an explicit call to str.upper, etc.), passing the string as the first argument.
This is most useful when applying one of these methods to many strings at once in, say, a map function.
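As a small sketch of that usage, the list of words below is made up for illustration.

```python
words = ["this", "is", "a", "test"]

# Calling the method through the class takes the string as an explicit argument.
print(str.upper("test") == "test".upper())   # True

# That makes it easy to hand the method itself to map().
shouted = list(map(str.upper, words))
print(shouted)   # ['THIS', 'IS', 'A', 'TEST']
```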
Conversion between str or bytes data and unicode characters
The contents of files and network messages may represent encoded characters. They often need to be converted to unicode for proper display.
In Python 2, you may need to convert str data to Unicode characters. The default ('', "", etc.) is an ASCII string, with any values outside of ASCII range displayed as escaped values. Unicode strings are u'' (or u"", etc.).
In Python 3 you may need to convert arrays of bytes (written as a 'bytes literal') to strings of Unicode characters. The default is now a Unicode string, and bytestring literals must now be entered as b'', b"", etc. A bytes value will return True for isinstance(some_val, bytes), assuming some_val to be a string that might be encoded as bytes.
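A minimal Python 3 round trip between str and bytes might look like the following. UTF-8 is assumed as the encoding; real data may use a different one.

```python
text = "caf\u00e9"             # a str of Unicode characters
data = text.encode("utf-8")    # str -> bytes

print(isinstance(data, bytes))   # True
print(data)                      # b'caf\xc3\xa9'

decoded = data.decode("utf-8")   # bytes -> str
print(decoded == text)           # True
```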
Counting number of times a substring appears in a string
One method is available for counting the number of occurrences of a sub-string in another string, str.count.
str.count(sub[, start[, end]])
str.count returns an int indicating the number of non-overlapping occurrences of the sub-string sub in another string. The optional arguments start and end indicate the beginning and the end of the range in which the search will take place. By default start = 0 and end = len(str), meaning the whole string will be searched.
By specifying different values for start and end we can get a more localized search and count. For example, for a string s, if start is equal to 13 the call s.count(sub, start) is equivalent to s[start:].count(sub).
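The following sketch shows the default search, non-overlapping counting, and the effect of start. The sample sentence is an arbitrary choice.

```python
s = "She sells seashells by the seashore."

print(s.count("sea"))   # 2
print(s.count("se"))    # 3

# Occurrences are counted without overlap.
print("aaaa".count("aa"))   # 2

# Restricting the search with start gives the same result as slicing first.
start = 13
print(s.count("sea", start))    # 1
print(s[start:].count("sea"))   # 1
```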
Join a list of strings into one string
A string can be used as a separator to join a list of strings together into a single string using the join() method. For example you can create a string where each element in a list is separated by a space.
The following example separates the string elements with three hyphens.
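Both cases can be sketched as follows; the word list is made up for illustration.

```python
words = ["once", "upon", "a", "time"]

spaced = " ".join(words)
print(spaced)     # once upon a time

# Any string can act as the separator, here three hyphens.
hyphenated = "---".join(words)
print(hyphenated)   # once---upon---a---time
```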
Python provides functions for justifying strings, enabling text padding to make aligning various strings much easier.
ljust and rjust are very similar. Both have a width parameter and an optional fillchar parameter. Any string created by these functions is at least as long as the width parameter that was passed into the function. If the string is already longer than width, it is not truncated. The fillchar argument, which defaults to the space character ' ', must be a single character, not a multicharacter string.
The ljust function pads the end of the string it is called on with the fillchar until it is width characters long. The rjust function pads the beginning of the string in a similar fashion. Therefore, the l and r in the names of these functions refer to the side on which the original string, not the fillchar, is positioned in the output string.
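A brief sketch of both methods, using a visible fill character so the padding is easy to see. The values are arbitrary.

```python
label = "abc"

print(label.ljust(6, "-"))   # abc---
print(label.rjust(6, "-"))   # ---abc

# With the default fill character the padding is spaces.
print(repr(label.rjust(6)))  # '   abc'

# A string already longer than width is returned without truncation.
print("abcdefgh".ljust(6))   # abcdefgh
```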
Replace all occurrences of one substring with another substring
The str type also has a method for replacing occurrences of one sub-string with another sub-string in a given string. For more demanding cases, one can use re.sub.
str.replace(old, new[, count]):
str.replace takes two arguments, old and new, containing the old sub-string which is to be replaced by the new sub-string. The optional argument count specifies the number of replacements to be made.
For example, in order to replace 'foo' with 'spam' in a string, we can call str.replace with old = 'foo' and new = 'spam'.
If the given string contains multiple examples that match the old argument, all occurrences are replaced with the value supplied in new unless, of course, we supply a value for count. In this case at most count occurrences are going to get replaced, starting from the left.
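A minimal sketch of both behaviours, with sample sentences invented for the purpose:

```python
single = "Make sure to foo your sentence."
print(single.replace("foo", "spam"))
# Make sure to spam your sentence.

several = "foo one, foo two, foo three"

# Without count, every occurrence is replaced.
print(several.replace("foo", "spam"))
# spam one, spam two, spam three

# With count=1, only the first occurrence is replaced.
print(several.replace("foo", "spam", 1))
# spam one, foo two, foo three
```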
Reversing a string
A string can be reversed using the built-in reversed() function, which takes a string and returns an iterator in reverse order.
reversed() can be wrapped in a call to ''.join() to make a string from the iterator.
While reversed() might be more readable to uninitiated Python users, using extended slicing with a step of -1 is faster and more concise. The slice can also be wrapped in a small function.
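Both approaches can be sketched as below. The function name reverse_string is an illustrative choice.

```python
def reverse_string(text):
    """Return text reversed, using extended slicing with a step of -1."""
    return text[::-1]


word = "hello"

# reversed() yields the characters back to front; join() rebuilds a string.
print("".join(reversed(word)))   # olleh
print(reverse_string(word))      # olleh
```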
Split a string based on a delimiter into a list of strings
str.split takes a string and returns a list of substrings of the original string. The behavior differs depending on whether the sep argument is provided or omitted.
If sep isn't provided, or is None, then the splitting takes place wherever there is whitespace. However, leading and trailing whitespace is ignored, and multiple consecutive whitespace characters are treated the same as a single whitespace character.
The sep parameter can be used to define a delimiter string. The original string is split where the delimiter string occurs, and the delimiter itself is discarded. Multiple consecutive delimiters are not treated the same as a single occurrence, but rather cause empty strings to be created.
The default is to split on every occurrence of the delimiter; however, the maxsplit parameter limits the number of splittings that occur. The default value of -1 means no limit.
str.rsplit ("right split") differs from str.split ("left split") when maxsplit is specified. The splitting starts at the end of the string rather than at the beginning.
Note: Python specifies the maximum number of splits performed, while most other programming languages specify the maximum number of substrings created. This may create confusion when porting or comparing code.
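The different cases can be sketched with a few arbitrary sample strings:

```python
# No sep: split on runs of whitespace, ignoring leading and trailing whitespace.
print(" This is    a sentence.  ".split())
# ['This', 'is', 'a', 'sentence.']

planets = "Earth,Stars,Sun,Moon"

# Explicit sep: the delimiter is discarded.
print(planets.split(","))
# ['Earth', 'Stars', 'Sun', 'Moon']

# Consecutive delimiters produce empty strings.
print("Earth,,Stars".split(","))
# ['Earth', '', 'Stars']

# maxsplit limits the number of splits, not the number of pieces.
print(planets.split(",", maxsplit=1))
# ['Earth', 'Stars,Sun,Moon']

# rsplit counts its splits from the right-hand end.
print(planets.rsplit(",", maxsplit=1))
# ['Earth,Stars,Sun', 'Moon']
```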
str.format and f-strings: Format values into a string
Python provides string interpolation and formatting functionality through the str.format method, introduced in version 2.6, and f-strings, introduced in version 3.6.
Given the following variables:
The following statements are all equivalent
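As an illustration of equivalent spellings, the sketch below uses hypothetical variables (count, ratio, name); the original example's values are not reproduced here.

```python
# Hypothetical values chosen only for illustration.
count = 10
ratio = 1.5
name = "foo"

positional = "{} {} {}".format(count, ratio, name)
indexed = "{0} {1} {2}".format(count, ratio, name)
keyword = "{count} {ratio} {name}".format(count=count, ratio=ratio, name=name)
fstring = f"{count} {ratio} {name}"   # f-strings need Python 3.6 or later

print(positional)   # 10 1.5 foo
print(positional == indexed == keyword == fstring)   # True
```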
For reference, Python also supports C-style qualifiers for string formatting with the % operator. These can produce the same output, but the str.format versions are preferred due to benefits in flexibility, consistency of notation, and extensibility.
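A short sketch of the C-style form next to its str.format counterpart, again with hypothetical variable names and values:

```python
# Hypothetical values chosen only for illustration.
count, ratio, name = 10, 1.5, "foo"

c_style = "%d %0.1f %s" % (count, ratio, name)
c_style_named = "%(count)d %(ratio)0.1f %(name)s" % {
    "count": count,
    "ratio": ratio,
    "name": name,
}

print(c_style)   # 10 1.5 foo
print(c_style == c_style_named == "{} {} {}".format(count, ratio, name))   # True
```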
The braces used for interpolation in str.format can also be numbered to reduce duplication when formatting strings. For example, "{0} and {0}".format(x) is equivalent to "{} and {}".format(x, x).
While the official python documentation is, as usual, thorough enough, pyformat.info has a great set of examples with detailed explanations.
The { and } characters can be escaped by using double brackets, {{ and }}.
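Numbered fields and brace escaping can be sketched together; the sample text is arbitrary.

```python
place = "Australia"

unnumbered = "I am from {}. I love cupcakes from {}!".format(place, place)
numbered = "I am from {0}. I love cupcakes from {0}!".format(place)
print(unnumbered == numbered)   # True

# Doubled braces produce literal brace characters in the output.
escaped = "{{'{}': {}}}".format("a", 5)
print(escaped)   # {'a': 5}
```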
str.translate: Translating characters in a string
Python supports a translate method on the str type which allows you to specify the translation table (used for replacements) as well as any characters which should be deleted in the process.
|Parameter|Description|
|table|It is a lookup table that defines the mapping from one character to another.|
|deletechars|A list of characters which are to be removed from the string (Python 2 only).|
The maketrans method (str.maketrans in Python 3 and string.maketrans in Python 2) allows you to generate a translation table.
The translate method returns a string which is a translated copy of the original string.
In Python 2, you can set the table argument to None if you only need to delete characters. In Python 3, translate takes only the table, so characters to delete are mapped to None in the table itself, for example through the third argument of str.maketrans.
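A Python 3 sketch of both uses, replacing characters and deleting them. The vowel-to-digit mapping is an arbitrary example.

```python
sentence = "this is a string"

# Map each vowel to a digit: a->1, e->2, i->3, o->4, u->5.
replace_table = str.maketrans("aeiou", "12345")
print(sentence.translate(replace_table))   # th3s 3s 1 str3ng

# The third argument of maketrans lists characters to delete.
delete_table = str.maketrans("", "", "aeiou")
without_vowels = sentence.translate(delete_table)
print(repr(without_vowels))
```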
Python makes it extremely intuitive to check if a string contains a given substring. Just use the in operator.
Note: testing for an empty string will always result in True.
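For instance, with an arbitrary sample string:

```python
path = "foo.baz.bar"

print("foo" in path)      # True
print("xyz" in path)      # False
print("xyz" not in path)  # True

# The empty string is considered to be contained in every string.
print("" in path)         # True
```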
String module's useful constants
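The string module provides constants for string related operations. To use them, import the string module.
string.ascii_lowercase: Contains all lower case ASCII characters.
string.ascii_uppercase: Contains all upper case ASCII characters.
string.digits: Contains all decimal digit characters.
string.hexdigits: Contains all hex digit characters.
string.octdigits: Contains all octal digit characters.
string.punctuation: Contains all characters which are considered punctuation in the C locale.
string.whitespace: Contains all ASCII characters considered whitespace.
In script mode, print(string.whitespace) will print the actual characters; use repr to get the escaped form that the interactive interpreter displays.
string.printable: Contains all characters which are considered printable; a combination of string.digits, string.ascii_letters, string.punctuation, and string.whitespace.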
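The constants can be inspected directly, as in this short sketch:

```python
import string

print(string.ascii_lowercase)   # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase)   # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits)            # 0123456789
print(string.hexdigits)         # 0123456789abcdefABCDEF
print(string.octdigits)         # 01234567
print(string.punctuation)

# repr shows the escape sequences instead of the raw whitespace characters.
print(repr(string.whitespace))

# printable is the concatenation of four of the other constants.
combined = (string.digits + string.ascii_letters
            + string.punctuation + string.whitespace)
print(string.printable == combined)   # True
```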
Stripping unwanted leading/trailing characters from a string
Three methods are provided that offer the ability to strip leading and trailing characters from a string: str.strip, str.rstrip and str.lstrip. All three methods have the same signature and all three return a new string object with unwanted characters removed.
str.strip acts on a given string and removes (strips) any leading or trailing characters contained in the argument chars. If chars is not supplied or is None, all white space characters are removed by default.
If chars is supplied, all characters contained in it are removed from both ends of the string, stopping at the first character that is not in chars, and the result is returned. Note that chars is treated as a set of characters, not as a prefix or suffix.
str.rstrip and str.lstrip have similar semantics and arguments to str.strip(); their difference lies in the direction from which they start. str.rstrip() starts from the end of the string while str.lstrip() strips from the start of the string.
For example, "  spacious  ".rstrip() returns "  spacious", while "  spacious  ".lstrip() returns "spacious  ".
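A compact sketch of the three methods; all sample strings are arbitrary.

```python
padded = "    a line with leading and trailing space     "
print(repr(padded.strip()))
# 'a line with leading and trailing space'

# chars is a set of characters to strip, not a literal prefix or suffix.
prompt = ">>> a Python prompt"
print(prompt.strip("> "))                 # a Python prompt
print("www.example.com".strip("cmowz."))  # example

# rstrip and lstrip work on one end only.
print(repr("  spacious  ".rstrip()))   # '  spacious'
print(repr("  spacious  ".lstrip()))   # 'spacious  '
```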
Test the starting and ending characters of a string
In order to test the beginning and ending of a given string in Python, one can use the methods str.startswith() and str.endswith().
str.startswith(prefix[, start[, end]])
As its name implies, str.startswith is used to test whether a given string starts with the given characters in prefix.
The optional arguments start and end specify the start and end points from which the testing will start and finish. For example, by specifying a start value of 2, the string is searched from position 2 onwards: for s = "This is a test string", s.startswith("is", 2) yields True, since s[2] == 'i' and s[3] == 's'.
You can also use a tuple to check if it starts with any of a set of strings.
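These cases can be sketched as follows, with an arbitrary sample string:

```python
s = "This is a test string"

print(s.startswith("T"))      # True
print(s.startswith("Thi"))    # True
print(s.startswith("thi"))    # False, the test is case sensitive

# Start testing at index 2, where the text continues with "is".
print(s.startswith("is", 2))  # True

# A tuple matches if any of its members is a prefix.
print(s.startswith(("This", "That")))   # True
print(s.startswith(("ab", "bc")))       # False
```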
str.endswith(suffix[, start[, end]])
str.endswith works just like str.startswith, with the only difference being that it searches for ending characters and not starting characters. For example, to test if a string ends in a full stop, one could write s.endswith('.').
As with startswith, more than one character can be used as the ending sequence.
You can also use a tuple to check if it ends with any of a set of strings.
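A matching sketch for endswith, again with an arbitrary sample string:

```python
s = "this ends in a full stop."

print(s.endswith("."))        # True
print(s.endswith("!"))        # False

# The suffix may be longer than one character.
print(s.endswith("stop."))    # True

# A tuple matches if any of its members is a suffix.
print(s.endswith((".", "!", "?")))   # True
print(s.endswith(("x", "y")))        # False
```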
Testing what a string is composed of
The str type also features a number of methods that can be used to evaluate the contents of a string. These are str.isalpha, str.isdigit, str.isalnum and str.isspace. Capitalization can be tested with str.isupper, str.islower and str.istitle.
str.isalpha takes no arguments and returns True if all the characters in a given string are alphabetic.
As an edge case, the empty string evaluates to False when used with "".isalpha().
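A few arbitrary inputs illustrate the behaviour of isalpha:

```python
print("HelloWorld".isalpha())    # True
print("Hello World".isalpha())   # False, contains a space
print("Hello2World".isalpha())   # False, contains a digit
print("HelloWorld!".isalpha())   # False, contains punctuation
print("".isalpha())              # False, the empty-string edge case
```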
These methods test the capitalization in a given string.
str.isupper is a method that returns True if all cased characters in a given string are uppercase (and there is at least one cased character) and False otherwise.
Conversely, str.islower is a method that returns True if all cased characters in a given string are lowercase (and there is at least one cased character) and False otherwise.
str.istitle returns True if the given string is title cased; that is, every word begins with an uppercase character followed by lowercase characters.
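The three capitalization tests can be sketched with a few arbitrary strings:

```python
print("HELLO WORLD".isupper())   # True
print("HeLLO WORLD".isupper())   # False
print("".isupper())              # False, there are no cased characters

print("hello world".islower())   # True
print("Hello world".islower())   # False

print("Hello World".istitle())   # True
print("hello world".istitle())   # False
print("Hello world".istitle())   # False
```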
str.isdecimal returns whether the string is a sequence of decimal digits, suitable for representing a decimal number.
str.isdigit includes digits not in a form suitable for representing a decimal number, such as superscript digits.
str.isnumeric includes any number values, even if not digits, such as values outside the range 0-9.
Bytestrings (bytes in Python 3, str in Python 2) only support isdigit, which only checks for basic ASCII digits.
As with str.isalpha, the empty string evaluates to False.
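The differences between the three methods show up on non-ASCII input. The sample characters below are chosen for illustration and are written as escapes so they survive any encoding.

```python
samples = {
    "ascii digits": "12345",
    "arabic-indic digit one": "\u0661",
    "superscript two": "\u00b2",
    "vulgar fraction one half": "\u00bd",
}

# Print isdecimal, isdigit and isnumeric for each sample.
for label, value in samples.items():
    print(label, value.isdecimal(), value.isdigit(), value.isnumeric())

# bytes objects offer isdigit but not isdecimal or isnumeric.
print(b"123".isdigit())              # True
print(hasattr(bytes, "isdecimal"))   # False
```

In this sketch the superscript two is a digit but not a decimal, and the fraction is numeric only.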
str.isalnum is a combination of str.isalpha and str.isnumeric; specifically, it evaluates to True if all characters in the given string are alphanumeric, that is, they consist of alphabetic or numeric characters.
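A few arbitrary inputs illustrate isalnum:

```python
print("Hello2World".isalnum())   # True
print("HelloWorld".isalnum())    # True
print("2016".isalnum())          # True
print("Hello World".isalnum())   # False, the space is not alphanumeric
print("".isalnum())              # False
```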
str.isspace evaluates to True if the string contains only whitespace characters.
Sometimes a string looks "empty" but we don't know whether it's because it contains just whitespace or no character at all. For the empty string, isspace returns False.
To cover this case we need an additional test, such as my_str.isspace() or not my_str.
But the shortest way to test if a string is empty or just contains whitespace characters is to use strip (with no arguments it removes all leading and trailing whitespace characters), as in not my_str.strip().
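The two tests can be compared in a short sketch. The helper name is_blank is an illustrative choice.

```python
def is_blank(text):
    """Return True if text is empty or contains only whitespace."""
    # strip() removes all leading and trailing whitespace; what remains is
    # empty (and therefore falsy) exactly when text was blank.
    return not text.strip()


print("\t  \r\n".isspace())   # True
print("".isspace())           # False

for candidate in ["", "   ", "\t\n", " x "]:
    longer_test = candidate.isspace() or not candidate
    print(repr(candidate), longer_test, is_blank(candidate))
```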