# 12. Libraries

Python's built-in modules cover many common needs, but a large ecosystem of external libraries (also called packages or modules) provides additional Application Programming Interfaces (APIs). This chapter looks at a few of these external libraries.

## 12.1. pip

One popular way to install external libraries is through `pip`. `pip` is a command-line tool that installs Python libraries from the Python Package Index (`PyPI`). To install a library from PyPI, all you need to know is the package name, e.g. `pandas`; you can then run the installation as follows.

```
pip install <package_name>
```

You can also install multiple packages in one line.

```
pip install <package_name_1> <package_name_2>
```
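After installing, one way to confirm that a package is available is to ask Python for its installed version. The sketch below uses the standard-library `importlib.metadata` module; the helper name `installed_version` is made up for illustration, and the package names are only examples.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# 'pip' and 'pandas' are just examples; any installed package name works here
for name in ["pip", "pandas"]:
    print(name, installed_version(name))
```

A result of `None` means the package is not installed in the Python environment running the script.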

Note

`pip` will do its best to resolve transitive dependencies and install them as well. Transitive dependencies are the packages that the package you are installing depends on, directly or indirectly, in order to work.
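To get a feel for what `pip` has to resolve, you can list the requirements that an installed package declares. This is a minimal sketch using the standard-library `importlib.metadata.requires()` function; the helper name `direct_dependencies` is hypothetical, and `pandas` is only an example that must already be installed for anything to be printed.

```python
from importlib.metadata import PackageNotFoundError, requires


def direct_dependencies(package_name):
    """Return the requirement strings a package declares, or [] if none are found."""
    try:
        # requires() returns None when a package declares no dependencies
        return requires(package_name) or []
    except PackageNotFoundError:
        return []


# each entry is a requirement string; some may belong to optional extras
for requirement in direct_dependencies("pandas"):
    print(requirement)
```

Each listed package can declare requirements of its own, which is how the chain of transitive dependencies arises.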

## 12.2. Pandas

```
pip install pandas
```

`Pandas` is a library for working with tabular data. Writing CSV files is easy using Pandas.

```
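import pandas as pd
import random

# 10 rows x 10 columns of random integers; randint includes both endpoints (0 to 101)
data = [[random.randint(0, 101) for _ in range(10)] for _ in range(10)]

# name the columns x0 ... x9
df = pd.DataFrame(data, columns=[f'x{i}' for i in range(10)])
print(df.shape)

# write the column names as a header row, but leave out the row index
df.to_csv('test.csv', header=True, index=False)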
```

Reading data from a CSV using Pandas is just as easy.

```
import pandas as pd

# the first row of test.csv is used as the column names
df = pd.read_csv('test.csv')

print(df.shape)
```
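To check that writing and reading are consistent, the two steps can be combined into a round trip. The sketch below uses an in-memory `io.StringIO` buffer instead of a file on disk, and a small hand-written DataFrame; both are choices made for illustration only.

```python
import io

import pandas as pd

original = pd.DataFrame({"x0": [1, 2, 3], "x1": [4, 5, 6]})

# write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
original.to_csv(buffer, header=True, index=False)

# rewind the buffer and read the CSV text back into a new DataFrame
buffer.seek(0)
restored = pd.read_csv(buffer)

print(buffer.getvalue())
print(restored.equals(original))
```

Passing `index=False` when writing matters here: otherwise the row index would be written as an extra unnamed column and read back as data.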

## 12.3. NumPy

```
pip install numpy scipy
```

`NumPy` is a numerical computing library. `SciPy` builds on NumPy and is a general-purpose scientific computing library. To draw samples from a normal distribution centered on 0 with a scale (standard deviation) of 1, \(\mathcal{N}(0, 1)\), we can use the `normal()` function from `numpy.random`.

```
from numpy.random import normal

# draw 100 samples from N(0, 1): loc (mean) = 0, scale (standard deviation) = 1
values = normal(0, 1, 100)
print(values)
```
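Each run of the code above prints different values. If repeatable draws are needed, one option is NumPy's seeded `Generator`, created with `numpy.random.default_rng()`. The sketch below assumes an arbitrary seed of 37 and a larger sample so that the sample mean and standard deviation can be compared with the parameters 0 and 1.

```python
import numpy as np

# a seeded Generator makes the draws repeatable; the seed 37 is arbitrary
rng = np.random.default_rng(37)
values = rng.normal(loc=0.0, scale=1.0, size=10_000)

# with this many samples, the estimates should land near 0 and 1
print(f"sample mean = {values.mean():.3f}")
print(f"sample std  = {values.std():.3f}")
```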

## 12.4. Scikit-Learn

```
pip install scikit-learn
```

`Scikit-Learn` is a machine learning library. We can use it to learn predictive models, generate synthetic data, transform data and so on.

```
from sklearn.datasets import make_regression

# generate a synthetic regression problem; only 10 of the 50 features influence y
X, y = make_regression(
    n_samples=1000,
    n_features=50,
    n_informative=10,
    n_targets=1,
    bias=5.3,
    random_state=37
)
print(f'X shape = {X.shape}, y shape = {y.shape}')
```
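The generated data can also be used to learn a predictive model. The following is a minimal sketch, not a recommended workflow: it assumes a plain `LinearRegression` and an 80/20 train/test split, neither of which comes from the text above.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=50, n_informative=10,
                       n_targets=1, bias=5.3, random_state=37)

# hold out 20% of the rows to check the model on data it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=37)

model = LinearRegression().fit(X_train, y_train)

# the data has no noise by default, so a linear model should fit almost exactly
print(f"R^2 on held-out data = {model.score(X_test, y_test):.4f}")
print(f"estimated intercept = {model.intercept_:.2f}")
```

Because `make_regression()` adds no noise unless asked to, the estimated intercept should recover the `bias` value that was passed in.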

## 12.5. joblib

```
pip install joblib
```

`Joblib` is a library that makes multi-core processing easier in Python.

```
from math import sqrt
from joblib import Parallel, delayed

# run sqrt(i ** 2) for i = 0..9 using 2 workers; results keep the input order
results = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
print(results)
```
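The same pattern works with your own functions. In the sketch below, `cube` is a hypothetical stand-in for a more expensive computation; the parallel results are compared with a plain sequential loop to show that both give the same answers in the same order.

```python
from joblib import Parallel, delayed


def cube(n):
    """A stand-in for a more expensive computation."""
    return n ** 3


inputs = list(range(10))

# delayed(cube)(n) captures the call without running it; Parallel then executes the calls
parallel_results = Parallel(n_jobs=2)(delayed(cube)(n) for n in inputs)
sequential_results = [cube(n) for n in inputs]

# results come back in input order, so the two lists should match
print(parallel_results == sequential_results)
```

For a function this cheap, the overhead of starting workers likely outweighs any gain; parallelism tends to pay off when each call does a substantial amount of work.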