As you know, the Python list is pretty powerful: a list can hold values of any type, and even different types at the same time. You can also change, add, and remove elements.
This is wonderful, but one feature is missing.
When analyzing data, you'll often want to carry out operations over entire collections of values, and you want to do this fast. With lists, this is a problem.
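Let's say we have two lists, `height` and `weight`, holding the heights and weights of your family members.
If you now want to calculate the Body Mass Index for each family member, you'd hope that a call like `weight / height ** 2` would work, performing the calculation element-wise.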
Unfortunately, Python throws an error, because it has no idea how to do calculations with lists.
You could solve this by going through each list element one after the other, and calculating the BMI for each person separately, but this is terribly inefficient and tiresome to write.
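As a rough sketch of the problem, the snippet below tries the element-wise calculation on plain lists and then falls back to going through the elements one by one. The values in `height` and `weight` are made-up sample data, and `list_math_failed` is a hypothetical flag added only for illustration.

```python
# Made-up sample data: heights in metres, weights in kilograms.
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Plain lists do not support element-wise arithmetic, so this raises TypeError.
list_math_failed = False
try:
    bmi = weight / height ** 2
except TypeError as err:
    list_math_failed = True
    print("Lists cannot do this:", err)

# Workaround: compute the BMI for each person separately.
bmi = [w / h ** 2 for h, w in zip(height, weight)]
print(bmi)
```

This works, but every new calculation needs its own loop or comprehension.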
A far more elegant solution is to use NumPy, short for Numerical Python. It's a Python package that, among other things, provides an alternative to the regular Python list: the NumPy array.
The NumPy array is pretty similar to a regular Python list, but has one additional feature: you can perform calculations over entire arrays.
It's really easy, and super-fast as well.
Let's start by _creating_ a NumPy array. You do this with NumPy's `array()` function, which takes a regular Python list as input. I'm using `array()` twice here, to create NumPy versions of the `height` and `weight` lists.
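A minimal sketch of this step might look as follows. The list values are made-up sample data, and the names `np_height` and `np_weight` are assumptions chosen for illustration.

```python
import numpy as np

# Made-up sample data: heights in metres, weights in kilograms.
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# np.array() builds a NumPy array from a regular Python list.
np_height = np.array(height)
np_weight = np.array(weight)

print(type(np_height))   # the array type is numpy.ndarray
print(np_height)
print(np_weight)
```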
Let's try to calculate everybody's BMI with a single call again:
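Assuming the same made-up sample data and the illustrative names `np_height` and `np_weight`, the single call could be sketched like this:

```python
import numpy as np

# Made-up sample data: heights in metres, weights in kilograms.
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])

# The arithmetic is applied element by element, with no explicit loop.
bmi = np_weight / np_height ** 2
print(bmi)
```

Each entry of `bmi` is the weight divided by the squared height at the same position.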
First, we tried to do calculations with regular lists, like this, but this gave us an error, because Python doesn't know how to do calculations with lists the way we want it to.
Next, these regular lists were converted to NumPy arrays. The same operations now work without any problem: NumPy knows how to work with arrays as if they were single values.
First of all, NumPy can do all of this so easily because it assumes that your NumPy array contains values of a single type only. It's either an array of floats, or an array of booleans, and so on.
If you do try to create an array with different types, like this for example, the resulting Numpy array will contain a single type, string in this case. The boolean and the float were both converted to strings.
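As an illustration, the following sketch builds an array from a float, a string, and a boolean. The particular values and the name `mixed` are assumptions picked for the example.

```python
import numpy as np

# A float, a string, and a boolean in one list.
mixed = np.array([1.0, "is", True])

# NumPy picks one common type for all elements: here, a string type.
print(mixed)
print(mixed.dtype.kind)   # 'U' stands for a Unicode string type
```

The float and the boolean end up as the strings `'1.0'` and `'True'`.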
Second, you should know that a NumPy array is simply a new kind of Python type, like the float, string, and list types from before.
Take this Python list and this NumPy array, for example:
If you do `python_list + python_list`, the list elements are pasted together, generating a list with 6 elements.
If you do this with the NumPy arrays, on the other hand, Python will do an element-wise sum of the arrays:
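The difference can be sketched with a three-element list and array. The values `[1, 2, 3]` and the names `numpy_array`, `list_result`, and `array_result` are assumptions for illustration.

```python
import numpy as np

python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

# For lists, + concatenates.
list_result = python_list + python_list
print(list_result)    # [1, 2, 3, 1, 2, 3]

# For NumPy arrays, + adds element by element.
array_result = numpy_array + numpy_array
print(array_result)   # [2 4 6]
```

The same operator behaves differently depending on the type it is applied to.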
Specifically for NumPy, there's also another way to do subsetting: using an array of booleans. Say you want to get all BMI values in the `bmi` array that are over 23. A first step is using the greater-than sign, like this:
The result is a NumPy array containing booleans: True if the corresponding BMI is above 23, False otherwise.
Next, you can use this boolean array inside square brackets to do subsetting.
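Both steps can be sketched together, again assuming made-up sample data; the names `over_23` and `high_bmi` are hypothetical.

```python
import numpy as np

# Made-up sample data: heights in metres, weights in kilograms.
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# Step 1: the comparison yields one boolean per element.
over_23 = bmi > 23
print(over_23)

# Step 2: the boolean array selects only the elements marked True.
high_bmi = bmi[over_23]
print(high_bmi)
```

With this sample data, only the fourth BMI value is above 23, so `high_bmi` holds a single element. In practice the two steps are often written in one go as `bmi[bmi > 23]`.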
- The NumPy package provides the array, a data type that can be used to do element-wise calculations.
- Because NumPy arrays can only hold elements of a single type, calculations on NumPy arrays can be carried out much faster than on regular Python lists.