Machine Learning From Scratch [Part 1]

This is part one of Machine Learning from Scratch.

In this lesson, you’ll learn how to:

  • Import a module from a bigger library
  • Start working with Matplotlib and Pyplot
  • Declare lists of data
  • Generate a line chart (X and Y axis) from the lists
  • Generate a bar chart

Discover the power of data by implementing machine learning algorithms in Python. Here, I’ll show you the logic behind each technique, so you’ll be able to apply machine learning in different situations.

No more talking, let’s get straight to it.

Assuming that you have Anaconda and Jupyter Notebooks installed, create a new notebook.

Let’s import the pyplot module from the library matplotlib. Pyplot is useful for generating simple charts from data. It’s not recommended for heavy-duty data visualizations – you wouldn’t use it live in a web dashboard.

#For making simple plots
from matplotlib import pyplot as plt
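As a side note, you will often see the same import written in a slightly different way in other tutorials. Here is a minimal sketch showing that both spellings give you the same module. The name plt_alt is only an illustrative alias and is not used anywhere else in this lesson.

```python
# Two equivalent ways to import the pyplot module from matplotlib
from matplotlib import pyplot as plt
import matplotlib.pyplot as plt_alt  # plt_alt is a hypothetical alias, used only for comparison

# Both names refer to the very same module object
print(plt is plt_alt)  # prints True
```

Either form works. This lesson sticks with the first one.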

Now, let’s declare two lists, each one containing 7 elements. Notice that their elements correspond by position: years[0] goes with gdp[0], years[1] goes with gdp[1], and so on for every element.

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]

gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]
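If you want to see that pairing explicitly, here is a small sketch in plain Python. The pairs variable is an illustrative name, not part of the lesson's code.

```python
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

# The lists only make sense together if they have the same length
assert len(years) == len(gdp)

# zip walks both lists in step, pairing years[i] with gdp[i]
pairs = list(zip(years, gdp))  # "pairs" is a hypothetical helper name
for year, value in pairs:
    print(year, value)
```

Each printed row is one point that the line chart will draw.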

Now, using pyplot, let’s plot a line chart.

X-axis: years

Y-axis: gdp

Take a close look at the plt.plot syntax. The data for the X-axis goes first, the data for the Y-axis goes second. Then, you set the keyword arguments you want:

  • color
  • marker ('o' draws a circle at each data point)
  • linestyle

# create a line chart: years on the x-axis, gdp on the y-axis
plt.plot(years, gdp, color = 'green', marker = 'o', linestyle = 'solid')

Now, let’s add a title and a y-axis label to our chart and print it right into the Jupyter notebook:

# add a title
plt.title("Nominal GDP")

# add a label to the y-axis
plt.ylabel("Billions of $")

# display the chart
plt.show()

This is the output you should see:

Pyplot is a simple and fast solution to generate visualizations from data.
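For convenience, the line chart steps can be collected into a single notebook cell. This is a sketch under a couple of assumptions: the x-axis label is an optional extra that is not part of the original steps, and the lines variable is only there to show what plt.plot returns.

```python
from matplotlib import pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

# plt.plot returns a list of Line2D objects, one per plotted line
lines = plt.plot(years, gdp, color='green', marker='o', linestyle='solid')

plt.title("Nominal GDP")
plt.ylabel("Billions of $")
plt.xlabel("Year")  # optional addition, not in the original steps

# Render the chart (in Jupyter it appears right below the cell)
plt.show()
```

Keeping everything in one cell makes it easy to tweak a color or a marker and re-run the whole chart.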

In business, you need to be agile. Pyplot charts may not be that good looking or interactive, but they will certainly do their job.

You don’t need to memorize each parameter of a function. For example, place your text cursor inside the parentheses of plt.plot() and press shift + tab. The docstring of the function will pop up on your screen:

Here they are: all the parameters the function can receive. If you don’t specify some of them (apart from the X-axis and Y-axis data), their default values will be used.
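To see the defaults in action, here is a rough sketch that passes only the two lists and then asks the resulting line what it ended up with. The getter methods belong to matplotlib's Line2D object. The commented-out help() call is an alternative to shift + tab that also works outside Jupyter.

```python
from matplotlib import pyplot as plt

years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

# Outside Jupyter, help() prints the same docstring that shift + tab shows
# help(plt.plot)

# Pass only the x and y data: everything else falls back to its default
(line,) = plt.plot(years, gdp)

print(line.get_linestyle())  # the default line style
print(line.get_marker())     # the default marker setting
print(line.get_color())      # the default color, taken from matplotlib's color cycle
```

The exact values printed depend on your matplotlib configuration, so treat them as a way to inspect the defaults rather than something to memorize.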

Now, let’s learn how to plot a bar chart.

Bar charts are useful when you want to show how some quantity varies among a discrete set of items.

Discrete items are separate categories, such as names, rather than points along a continuous progression of numbers.

We want to visualize the names and heights in meters of the tallest buildings in the world. After a quick Google search, you will come up with two lists of corresponding items: building_names and heights.

building_names = ["Burj Khalifa", "Shanghai Tower", "Makkah Tower", "Ping An Financial Center"]
heights = [828, 632, 601, 555]

As you’ve already imported Pyplot, it’s available in your Jupyter Notebook, so there’s no need to import it again. If you’ve closed this notebook, you will have to execute the import statement again.

If you type in plt.bar( and press shift+tab, the docstring of the function will pop up on your screen:

Again, you don’t need to memorize the parameters each function receives.

To draw the bar chart, plt.bar needs one x position per bar, so we need as many positions as there are building names. We’ll also pass the bars’ heights. Since the positions are just a range of integers, we can simply call range:

plt.bar(range(len(building_names)), heights)
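To see what range gives us for the building names, here is a small sketch in plain Python. The positions variable is an illustrative name introduced here, not part of the lesson's code.

```python
building_names = ["Burj Khalifa", "Shanghai Tower", "Makkah Tower", "Ping An Financial Center"]
heights = [828, 632, 601, 555]

# One integer position per building
positions = list(range(len(building_names)))  # "positions" is a hypothetical helper name
print(positions)  # prints [0, 1, 2, 3]

# Each position lines up with one name and one height
for position, name, height in zip(positions, building_names, heights):
    print(position, name, height)
```

Those integers are where the bars sit on the x-axis. The names themselves are attached to the axis later on.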

Let’s add a title to our bar chart and a label to the y-axis:

plt.title("Tallest buildings in the world") # add a title
plt.ylabel("Height in meters") # label the y-axis

To add labels to our X-axis, we’ll call xticks:

plt.xticks(range(len(building_names)), building_names)

plt.show() will literally show our bar chart, which should look like this:

We’ve just generated a bar chart using Pyplot. Note that the x-axis labels are messy because the building names are long. Pyplot is fast but not pixel perfect. Deal with it.
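If the messy labels bother you, one common workaround is to rotate them. The sketch below is a suggestion rather than part of the original lesson: it assumes you are happy with a 45 degree angle, and it relies on plt.xticks passing text properties such as rotation and ha through to the labels.

```python
from matplotlib import pyplot as plt

building_names = ["Burj Khalifa", "Shanghai Tower", "Makkah Tower", "Ping An Financial Center"]
heights = [828, 632, 601, 555]
positions = range(len(building_names))

# plt.bar returns a container holding one rectangle per bar
bars = plt.bar(positions, heights)
plt.title("Tallest buildings in the world")
plt.ylabel("Height in meters")

# rotation and ha (horizontal alignment) are text properties applied to the tick labels
plt.xticks(positions, building_names, rotation=45, ha='right')

# tight_layout adjusts the margins so the rotated labels are not cut off
plt.tight_layout()
plt.show()
```

The 45 degree angle is an arbitrary choice. Try other values until the labels read comfortably.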

That’s good for now. I believe that short tutorials are more productive than larger ones.

In the next tutorial of Machine Learning from Scratch, we’ll keep playing around with Pyplot, collections, histograms and line charts.

