All About Strings in Python

An introduction and brief guide to strings in Python

Dede Kurniawan
InfoSec Write-ups


Photo by Jeremy Bishop on Unsplash

Python is a high-level, object-oriented programming language that people from many backgrounds find easy to learn. It has several built-in data types, such as integer, float, boolean, and string. A string is a sequence of characters, and many programmers end up working with strings more often than with numbers.

How to create a String?

To create a string, enclose its characters in quotes. You are free to use either single quotes ('...') or double quotes ("...").

>>> print('python') 
python

>>> print("assembly")
assembly

In practice, you will often use both kinds of quotes together, for example when a string itself contains quote characters. You can put double quotes inside a single-quoted string, or single quotes inside a double-quoted one.

>>> print('Dad says, "Python is a fun programming language!"')
Dad says, "Python is a fun programming language!"

>>> print("Mom sings I don't care")
Mom sings I don't care

How do you create a multiline string? There are two ways: wrap the text in three single quotes ('''...''') or three double quotes ("""..."""), or insert the newline escape character (\n) into an ordinary string. Escape characters are special characters written with a leading backslash.

Python's escape characters: the most common ones are \n (newline), \t (tab), \\ (backslash), \' (single quote), and \" (double quote).
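As a small sketch of the two approaches to multiline strings, the snippet below builds the same text with triple quotes and with \n. The sample text and variable names are made up for illustration.

```python
# Two ways to build the same three-line string.
triple_quoted = """Roses are red
Violets are blue
Python is fun"""

escaped = "Roses are red\nViolets are blue\nPython is fun"

# Both spellings produce identical strings.
print(triple_quoted == escaped)  # True

# Other common escapes: \t is a tab, \\ is a single backslash.
print("name:\tAda")
print("C:\\temp")

# splitlines() breaks a multiline string back into its lines.
lines = triple_quoted.splitlines()
print(len(lines))  # 3
```

Triple quotes are usually easier to read for longer text, while \n is handy when the line break sits inside a short string.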

Apart from using quotes, you can also create strings using the built-in function str(). Other data types, such as integer, float, or boolean, can also be converted to strings using the str() function.

>>> str(10_000_000) 
'10000000'

>>> str(3.141592653589793)
'3.141592653589793'

>>> str(False)
'False'

Mathematical Operations on String

Two arithmetic operators are commonly applied to strings: addition (+) and multiplication (*). Addition concatenates strings, while multiplying a string by an integer repeats it.

>>> print("Cat " + "and" + " Dog") #addition operation
Cat and Dog

>>> name = "William"
>>> job = "Data scientist"
>>> print(name + " now working as " + job)
William now working as Data scientist

>>> print('python ' * 3) #multiplication operation
python python python

Other arithmetic operators, such as subtraction or division, raise a TypeError.

>>> print("Cat" - "Dog")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'str'
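A related case that often trips people up is mixing strings and numbers with +. The sketch below shows the error and one way around it with str(); the age value is a hypothetical example, not taken from the text above.

```python
name = "William"
age = 30  # hypothetical value for illustration

# "+" only joins strings to strings, so mixing in an int raises TypeError.
try:
    message = name + " is " + age
except TypeError as error:
    print("TypeError:", error)

# Convert the number with str() first, then concatenate.
message = name + " is " + str(age)
print(message)  # William is 30

# "*" needs an integer on the other side; it repeats the string.
divider = "-" * 10
print(divider)  # ----------
```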

Indexing and Slicing

Because a string is a sequence of characters, we can access individual characters with the bracket operator ([...]). To index or slice, put the offset in square brackets after the string literal or after the name of a variable that holds a string. Offsets start at 0, and negative offsets count backward from the end.

>>> animal = "Crocodile"
>>> animal[0] #get the first character
'C'

>>> animal[-1] #get the last character
'e'

>>> animal[5] #get the sixth character
'd'

If you specify an offset that is equal to or greater than the length of the string, Python raises an IndexError. You can find the length of a string with the len() function.

>>> animal[10] #raises an IndexError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

>>> len(animal)
9

Just like indexing, slicing a substring uses the bracket operator. Inside the square brackets you can specify a start offset, an end offset, and an optional step, separated by colons. The slice includes the start offset but stops just before the end offset.

>>> sentence = "python programming language"
>>> sentence[:] #get an entire string
'python programming language'

>>> sentence[7:]
'programming language'

>>> sentence[:18]
'python programming'

>>> sentence[7:18]
'programming'

>>> sentence[3:20:2]
'hnpormigl'

>>> sentence[::-1] #reverse a string
'egaugnal gnimmargorp nohtyp'
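Two slicing details are easy to miss: the end offset is exclusive, and slices are forgiving about offsets past the end of the string. The following sketch reuses the sentence from the examples above; the other variable names are only illustrative.

```python
sentence = "python programming language"

# The end offset is exclusive: sentence[0:6] covers offsets 0 through 5.
first_word = sentence[0:6]
print(first_word)  # python

# Negative offsets count from the end of the string.
last_word = sentence[-8:]
print(last_word)  # language

# Unlike indexing, a slice that runs past the end does not raise an error;
# it is simply cut off at the end of the string.
tail = sentence[19:100]
print(tail)  # language

# A slice that starts past the end yields an empty string.
empty = sentence[100:]
print(repr(empty))  # ''
```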

Note that the values used for indexing and slicing must be integers. If you use a float, Python raises a TypeError (the exact wording of the message varies between Python versions).

>>> letter = 'brawijaya university'
>>> letter[3.5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers

>>> letter[1:5.5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: slice indices must be integers or None or have an __index__ method

String Methods

When we create a string, it is actually an instance of the str class. Python is an object-oriented language, and everything in Python is an object. Each object has methods defined by its class. We can't call a method that belongs to integer objects on a string object; doing so raises an AttributeError.

>>> type(letter)
<class 'str'>
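To make the point about methods belonging to a class concrete, here is a minimal sketch. It uses int.bit_length() as an example of an integer-only method; the number 255 is an arbitrary choice.

```python
letter = "brawijaya university"
number = 255

# Both values are objects, each an instance of its own class.
print(isinstance(letter, str))  # True
print(isinstance(number, int))  # True

# bit_length() is defined on int ...
print(number.bit_length())  # 8

# ... but not on str, so calling it on a string raises AttributeError.
try:
    letter.bit_length()
except AttributeError as error:
    print("AttributeError:", error)

# hasattr() checks for a method without triggering the error.
print(hasattr(letter, "upper"))  # True
print(hasattr(number, "upper"))  # False
```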

Strings have many methods, but you don't need to memorize them all; if you forget one, you can look it up in the documentation. In practice, a handful of methods are used far more often than the rest, and you should be familiar with them. I'll cover a few of the most common ones.

To change the capitalization of text, use the .upper() method to convert a string to uppercase, the .lower() method to convert it to lowercase, and the .capitalize() method to make the first character uppercase and the rest lowercase [3].

>>> letter = "cOmPuTeR sCiEnCe"
>>> letter.upper()
'COMPUTER SCIENCE'

>>> letter.lower()
'computer science'

>>> letter.capitalize()
'Computer science'

To split a string into a list of substrings, use the .split() method; to combine a list of strings into one string, use the .join() method [3].

>>> letter = "I found a python in the forest"
>>> letter.split()
['I', 'found', 'a', 'python', 'in', 'the', 'forest']

>>> fruit = ["banana", "apple", "watermelon"]
>>> ",".join(fruit)
'banana,apple,watermelon'
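Both methods accept a separator of your choice. The sketch below, using made-up sample data, splits on a comma instead of whitespace and shows that join() only accepts strings.

```python
record = "banana,apple,watermelon"

# Passing a separator splits on that exact text instead of whitespace.
fruits = record.split(",")
print(fruits)  # ['banana', 'apple', 'watermelon']

# join() is called on the separator and takes the list as its argument.
spaced = " | ".join(fruits)
print(spaced)  # banana | apple | watermelon

# Splitting and joining with the same separator gives back the original.
print(",".join(record.split(",")) == record)  # True

# join() expects strings, so convert other types first.
numbers = [1, 2, 3]
joined_numbers = "-".join(str(n) for n in numbers)
print(joined_numbers)  # 1-2-3
```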

Strings are immutable, which means you cannot modify an existing string. You can use the .replace() method for simple substring substitution, but it returns a new string and leaves the original unchanged [3].

>>> sentence = "Anthony eats pizza"
>>> sentence.replace('pizza', 'banana')
'Anthony eats banana'

>>> sentence
'Anthony eats pizza'
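A short sketch of what immutability means in practice: assigning to an offset fails, and the usual workaround is to bind a name to the new string that a method returns. The variable name updated is only illustrative.

```python
sentence = "Anthony eats pizza"

# Assigning to an offset is not allowed because strings are immutable.
try:
    sentence[0] = "a"
except TypeError as error:
    print("TypeError:", error)

# replace() returns a new string; keep it by assigning the result.
updated = sentence.replace("pizza", "banana")
print(updated)   # Anthony eats banana
print(sentence)  # Anthony eats pizza

# Rebinding the same name to the result is the usual way to "change" it.
sentence = sentence.replace("pizza", "banana")
print(sentence)  # Anthony eats banana
```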

Furthermore, if you want to count how many times a certain letter or word occurs in a string, you can use .count() [3].

>>> sentence = "I'm camping in a random forest"
>>> sentence.count('a')
3

>>> fruit = "apple, banana, apple, watermelon, apple"
>>> fruit.count("apple")
3
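Two behaviors of .count() are worth knowing: it is case-sensitive, and it counts occurrences that do not overlap. The following sketch illustrates both with a made-up string.

```python
fruit = "Apple, banana, apple, watermelon, APPLE"

# count() is case-sensitive: only the exact spelling "apple" matches.
exact = fruit.count("apple")
print(exact)  # 1

# Lowercasing first gives a case-insensitive count.
any_case = fruit.lower().count("apple")
print(any_case)  # 3

# Occurrences are counted without overlap: "aaaa" holds two "aa", not three.
non_overlapping = "aaaa".count("aa")
print(non_overlapping)  # 2

# The in operator answers the simpler question "does it occur at all?"
print("banana" in fruit)  # True
```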

String Formatting

You can use the + sign to concatenate strings, but in practice that can make your code difficult to read and debug. Python offers three ways to format strings: the old style (supported in Python 2 and 3), the new style (Python 2.6 and up), and f-strings (Python 3.6 and up) [2].

First, we'll format strings using the old style (%). With the old style, we put a % sign followed by a conversion letter in the string for each value, and the letter depends on the data type.

>>> name = "anthony"
>>> age = 20
>>> print('his name is %s and now he is %s years old' % (name, age))
his name is anthony and now he is 20 years old

>>> print('the book costs, %d' % 10_000)
the book costs, 10000

>>> print('pi has a value of %f' % 3.141593)
pi has a value of 3.141593

In the example above, %s converts a value of any data type to a string. Use %d when you want an integer and %f when you want a float, which is shown with six decimal places by default. Other conversion types are also available, such as %x for hexadecimal and %e for exponent notation.
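Old-style conversions can also carry a width and a precision between the % sign and the letter. The sketch below shows a few combinations; the values and variable names are chosen only for illustration.

```python
pi = 3.141593
price = 10_000

# %.2f keeps two digits after the decimal point (the default is six).
rounded = 'pi is about %.2f' % pi
print(rounded)  # pi is about 3.14

# %8d right-aligns the integer in a field eight characters wide.
padded = 'cost: %8d' % price
print(padded)

# With a single value the tuple is optional; with several it is required.
both = '%s costs %d' % ('book', price)
print(both)  # book costs 10000

# %d rejects values that are not numbers.
try:
    'age: %d' % 'twenty'
except TypeError as error:
    print("TypeError:", error)
```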

Next is the new style of string formatting. With the new style, the code follows the pattern 'string {}'.format(data).

>>> name = "Jonathan"
>>> age = 35
>>> print('{}'.format(name))
Jonathan

>>> print('{} now {} years old'.format(name, age))
Jonathan now 35 years old

>>> print('{name} have 3 {fruit}'.format(name='Jony', fruit='apples'))
Jony have 3 apples
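The braces can hold more than a name. As a rough sketch, the snippet below shows positional indexes, a precision, an alignment, and literal braces with .format().

```python
name = "Jonathan"
age = 35

# Numbers inside the braces pick arguments by position, so one can repeat.
repeated = '{0} is {1}; yes, {0} is {1}'.format(name, age)
print(repeated)  # Jonathan is 35; yes, Jonathan is 35

# A format spec after the colon controls precision ...
rounded = '{:.2f}'.format(3.141593)
print(rounded)  # 3.14

# ... or width and alignment (here: right-aligned in 10 characters,
# which leaves two leading spaces before the 8-letter name).
aligned = '{:>10}'.format(name)
print(aligned)

# Doubled braces produce literal braces in the output.
literal = '{{}} is replaced by {}'.format(age)
print(literal)  # {} is replaced by 35
```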

The new style is quite similar to the old style; the things to note are the {} placeholders and the .format() call. Python 3.6 and up also has f-strings. Using f-strings for string formatting is highly recommended because they are very easy to read and debug.

They are called f-strings because you prefix the string with the letter 'f' or 'F'. They look almost like the new style, but an f-string does not require .format(). To write an f-string, put the letter 'f' before the opening single or double quote, and inside the quotes add {...} placeholders that can refer to variables, literal values, or other expressions. Here is an example:

>>> name = "Jonathan"
>>> age = 35
>>> print(f'{name} now {age} years old')
Jonathan now 35 years old

>>> print(f'the result of 5 plus 10 is {5+10}')
the result of 5 plus 10 is 15

>>> school = 'brawijaya university'
>>> print(f'I am now studying at {school.upper()}')
I am now studying at BRAWIJAYA UNIVERSITY
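f-strings accept format specs after a colon as well. The sketch below shows a few of them, along with what happens when the f prefix is forgotten; the population figure is a made-up value.

```python
name = "Jonathan"
pi = 3.141592653589793
population = 10_000_000  # hypothetical figure for illustration

# A precision after the colon rounds the displayed value.
rounded = f'pi is about {pi:.2f}'
print(rounded)  # pi is about 3.14

# A comma in the spec adds thousands separators.
grouped = f'{population:,} people'
print(grouped)  # 10,000,000 people

# Any expression is allowed inside the braces, including method calls.
shout = f'{name.upper()} has {len(name)} letters'
print(shout)  # JONATHAN has 8 letters

# Without the f prefix the braces are left as plain text.
plain = '{name} has {len(name)} letters'
print(plain)  # {name} has {len(name)} letters
```

The last case is a common slip: the code runs without error, so the missing prefix only shows up in the output.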

Visit this website if you want to learn more about string formatting.

Conclusion

A string is a Python data type that holds a sequence of characters. In this article we covered strings in Python: how to create them, mathematical operations on strings, indexing and slicing, string methods, and string formatting.

References:

[1] Downey, A. B. (2015). Think Python: How to Think Like a Computer Scientist. O’Reilly Media.

[2] Lubanovic, B. (2019). Introducing Python: Modern Computing in Simple Packages. O’Reilly Media.

[3] https://docs.python.org/3/library/stdtypes.html#string-methods
