* Edit Jan 2021: I recently completed a YouTube video covering the topics in this post.

In the last lesson we learned that Python can act as a powerful calculator, but numbers are just one of many basic data types you'll encounter in data analysis. A solid understanding of basic data types is essential for working with data in Python.

Integers

Integers, or "ints" for short, are whole-number numeric values. Any positive or negative number (or 0) without a decimal point is an integer in Python. Integer values have unlimited precision, meaning an integer is always exact, no matter how large it is. You can check the type of a Python object with the type() function. Let's run type() on an integer:

In [1]:

type(12)

Out[1]:

int

Above we see that 12 is of type "int". You can also use the function isinstance() to check whether an object is an instance of a given type:

In [2]:

# Check if 12 is an instance of type "int"

isinstance(12, int)

Out[2]:

True

The code output True confirms that 12 is an int.
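To illustrate the earlier point that integers have unlimited precision, here is a small sketch. The variable names big and same are made up for this example:

```python
# Python ints grow as large as needed, so big results stay exact.
big = 2 ** 100
print(big)         # 1267650600228229401496703205376
print(type(big))   # <class 'int'>

# Adding 1 and subtracting it again gives back exactly the same value.
same = (big + 1) - 1 == big
print(same)        # True
```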

Integers support all the basic math operations we covered last time. If a math operation involving integers would result in a non-integer (decimal) value, the result becomes a float:

In [3]:

1/3  # A third is not a whole number*

Out[3]:

0.3333333333333333

In [4]:

type(1/3)  # So the type of the result is not an int

Out[4]:

float

*Note: In Python 2, integer division performs floor division instead of converting the ints to floats as we see here in Python 3, so 1/3 would return 0 instead of 0.3333333.
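If you want the whole-number behavior in Python 3, the floor division operator // is available. The short sketch below compares the two operators; the variable names are only for illustration:

```python
true_div = 7 / 2     # true division always returns a float
floor_div = 7 // 2   # floor division rounds down to a whole number
even_div = 4 / 2     # still a float, even though the result is whole

print(true_div)        # 3.5
print(floor_div)       # 3
print(even_div)        # 2.0
print(type(even_div))  # <class 'float'>
```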

Floats

Floating point numbers or "floats" are numbers with decimal values. Unlike integers, floats don't have unlimited precision: each float is stored in a fixed amount of memory (typically 64 bits), so values with very long or infinite decimal expansions can't be stored exactly. Instead, the computer stores the closest value it can represent, so there can be small rounding errors in floats, even for short decimals like 0.1. This error is so minuscule it usually isn't of concern to us, but it can add up in certain cases when making many repeated calculations.
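Here is a rough sketch of what those rounding errors can look like in practice. It uses math.isclose() from the standard library to compare floats with a tolerance; the variable names total and running are just for this example:

```python
import math

total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # False, because of a tiny rounding error

# Comparing with a tolerance is usually safer than using == on floats.
print(math.isclose(total, 0.3))   # True

# Small errors can accumulate over repeated calculations.
running = 0.0
for _ in range(10):
    running += 0.1
print(running == 1.0)               # False
print(math.isclose(running, 1.0))   # True
```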

Every number in Python with a decimal point is a float, even if there are no non-zero numbers after the decimal:

In [5]:

type(1.0)

Out[5]:

float

In [6]:

isinstance(0.33333, float)

Out[6]:

True

The arithmetic operations we learned last time work on floats as well as ints. If you use both floats and ints in the same math expression the result is a float:

In [7]:

5 + 1.0

Out[7]:

6.0

You can convert a float to an integer using the int() function:

In [8]:

int(6.0)

Out[8]:

6

You can convert an integer to a float with the float() function:

In [9]:

float(6)

Out[9]:

6.0
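One detail worth knowing about these conversions: int() drops everything after the decimal point rather than rounding. The sketch below contrasts it with the built-in round() function; the variable names are illustrative only:

```python
truncated_pos = int(6.7)    # the fractional part is discarded, not rounded
truncated_neg = int(-6.7)   # truncation moves toward zero
rounded = round(6.7)        # round() goes to the nearest whole number

print(truncated_pos)   # 6
print(truncated_neg)   # -6
print(rounded)         # 7
```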

Floats can also take on a few special values: Inf, -Inf and NaN. Inf and -Inf stand for infinity and negative infinity, respectively. NaN stands for "not a number" and is sometimes used as a placeholder for missing or erroneous numerical values.

In [10]:

type(float("Inf"))

Out[10]:

float

In [11]:

type(float("NaN"))

Out[11]:

float
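These special values behave a little differently from ordinary numbers. As a brief sketch, using the standard library's math.isnan() and math.isinf() (the names inf and nan are just variables chosen for this example):

```python
import math

inf = float("inf")
nan = float("nan")

print(inf > 10 ** 100)     # True: infinity is larger than any number
print(-inf < -10 ** 100)   # True

# NaN is not equal to anything, including itself,
# so use math.isnan() to test for it instead of ==.
print(nan == nan)          # False
print(math.isnan(nan))     # True
print(math.isinf(inf))     # True
```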

*Note: Python contains a third, uncommon numeric data type "complex" which is used to store complex numbers.
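For the curious, a complex number can be written with a j suffix on the imaginary part. A tiny sketch, with z as an arbitrary example value:

```python
z = 3 + 4j

print(type(z))          # <class 'complex'>
print(z.real, z.imag)   # 3.0 4.0
print(abs(z))           # 5.0, the magnitude of z
```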

Booleans

Booleans or "bools" are true/false values that result from logical statements. In Python, booleans start with the first letter capitalized so True and False are recognized as bools but true and false are not. We've already seen an example of booleans when we used the isinstance() function above.

In [12]:

type(True)

Out[12]:

bool

In [13]:

isinstance(False, bool)  # Check if False is of type bool

Out[13]:

True

You can create boolean values with logical expressions. Python supports all of the standard logic operators you'd expect:

In [14]:

# Use >  and  < for greater than and less than:
    
20>10

Out[14]:

True

In [15]:

20<5

Out[15]:

False

In [16]:

# Use >= and  <= for greater than or equal and less than or equal:

20>=20

Out[16]:

True

In [17]:

30<=29

Out[17]:

False

In [18]:

# Use == (two equal signs in a row) to check equality:

10 == 10

Out[18]:

True

In [19]:

"cat" == "cat"

Out[19]:

True

In [20]:

True == False

Out[20]:

False

In [21]:

40 == 40.0  # Equivalent ints and floats are considered equal

Out[21]:

True

In [22]:

# Use != to check inequality. (think of != as "not equal to")

1 != 2

Out[22]:

True

In [23]:

10 != 10

Out[23]:

False

In [24]:

# Use the keyword "not" for negation:

not False

Out[24]:

True

In [25]:

not (2==2)

Out[25]:

False

In [26]:

# Use the keyword "and" for logical and:

(2 > 1) and (10 > 9)

Out[26]:

True

In [27]:

False and True

Out[27]:

False

In [28]:

# Use the keyword "or" for logical or:

(2 > 3) or (10 > 9)

Out[28]:

True

In [29]:

False or True

Out[29]:

True

Similar to math expressions, logical expressions have a fixed order of operations. In a logical statement, comparisons like >, < and == are executed first, followed by "not", then "and" and finally "or". See the Python documentation on operator precedence to learn more. Use parentheses to enforce the desired order of operations.

In [30]:

2 > 1 or 10 < 8 and not True

Out[30]:

True

In [31]:

((2 > 1) or (10 < 8)) and not True

Out[31]:

False
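To make the order of operations explicit, the sketch below adds parentheses that mirror how Python groups the first expression, then compares it with the regrouped version. The variable names are only for illustration:

```python
# Without parentheses, "not" binds first, then "and", then "or".
implicit = 2 > 1 or 10 < 8 and not True

# The same expression with the default grouping written out.
explicit = (2 > 1) or ((10 < 8) and (not True))

# Forcing "or" to run before "and" changes the result.
regrouped = ((2 > 1) or (10 < 8)) and (not True)

print(implicit)    # True
print(explicit)    # True
print(regrouped)   # False
```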

You can convert numbers into boolean values using the bool() function. All numbers other than 0 convert to True:

In [32]:

bool(1)

Out[32]:

True

In [33]:

bool(-12.5)

Out[33]:

True

In [34]:

bool(0)

Out[34]:

False
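The conversion also works for floats and in the other direction. A short sketch, with variable names chosen just for this example:

```python
zero_float = bool(0.0)      # a float zero is also False
tiny_float = bool(0.0001)   # any non-zero number is True

print(zero_float)    # False
print(tiny_float)    # True

# Going the other way, True converts to 1 and False to 0.
print(int(True))     # 1
print(int(False))    # 0
print(float(True))   # 1.0
```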

Strings

Text data in Python is known as a string or str. Surround text with single or double quotation marks to create a string:

In [35]:

type("cat")

Out[35]:

str

In [36]:

type('1')

Out[36]:

str

In [37]:

isinstance("hello!", str)

Out[37]:

True

You can define a multi-line string using triple quotes:

In [38]:

print("""This string spans
multiple lines""")

This string spans
multiple lines
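Under the hood, a line break inside a triple-quoted string is stored as the newline character, written "\n". The sketch below shows that the two spellings produce the same string; multi and single are illustrative names:

```python
multi = """This string spans
multiple lines"""

# The same text written on one line with an explicit newline character.
single = "This string spans\nmultiple lines"

print(multi == single)     # True
print(multi.count("\n"))   # 1
```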

You can convert numbers from their integer or float representation to a string representation and vice versa using the int(), float() and str() functions:

In [39]:

str(1)          # Convert an int to a string

Out[39]:

'1'

In [40]:

str(3.333)      # Convert a float to a string

Out[40]:

'3.333'

In [41]:

int('1')        # Convert a string to an int

Out[41]:

1

In [42]:

float('3.333')  # Convert a string to a float

Out[42]:

3.333
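These conversions matter because numbers stored as strings don't behave like numbers. The sketch below shows the difference, along with one common stumbling block: int() won't accept a string that contains a decimal point. The variable names are made up for this example, and try/except is used only to catch the error so the code keeps running:

```python
joined = "1" + "1"              # + on strings glues them together
added = int("1") + int("1")     # + on ints adds them

print(joined)   # 11
print(added)    # 2

# int() raises a ValueError for a string like "3.333" ...
try:
    int("3.333")
    raised = False
except ValueError:
    raised = True
print(raised)   # True

# ... but converting to a float first works.
whole = int(float("3.333"))
print(whole)    # 3
```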

Two quotation marks right next to each other (such as '' or "") with nothing in between them form the empty string. The empty string often represents a missing text value.
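A quick sketch of how the empty string behaves, using the built-in len() function to count characters (the variable names are illustrative):

```python
empty_single = ''
empty_double = ""

print(empty_single == empty_double)   # True: both spellings are the same value
print(len(empty_double))              # 0: the empty string has no characters
print(" " == "")                      # False: a space is a character
```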

Numeric data and logical data are generally well-behaved, but strings of text data can be very messy and difficult to work with. Cleaning text data is often one of the most laborious steps in preparing real data sets for analysis. We will revisit strings and functions to help you clean text data in a future lesson.

None

In Python, "None" is a special value, with its own data type NoneType, that is often used to represent a missing value. For example, if you define a function that doesn't return anything (does not give you back some resulting value) it will return "None" by default.

In [43]:

type(None)

Out[43]:

NoneType

In [44]:

# Define a function that prints the input but returns nothing*

def my_function(x):
    print(x)
    
my_function("hello") is None  # my_function has no return statement, so it returns None

hello

Out[44]:

True

*Note: We will cover defining custom functions in detail in a future lesson.
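As a further sketch of how None behaves, the example below stores a function's missing result and checks it with "is None", the usual way to test for None. The function no_return and the variable result are hypothetical names for this example:

```python
def no_return(x):
    x + 1   # computes a value but never returns it

result = no_return(5)

print(result)           # None
print(result is None)   # True
print(result == 0)      # False: None is not the same as zero
print(bool(None))       # False: None converts to False
```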

Wrap Up

This lesson covered the most common basic data types in Python, but it is not an exhaustive list of Python data objects or functions. The Python language's official documentation has a more thorough summary of built-in types, but it is a bit more verbose and detailed than is necessary when you are first getting started with the language.

Now that we know about the basic data types, it would be nice to know how to save values to use them later. We'll cover that in the next lesson.

Life Is Study

Monday, October 26, 2015

Python for Data Analysis Part 3: Basic Data Types

Next Time: Python for Data Analysis Part 4: Variables
