Visualization Matplotlib#

In this section, we’ll use matplotlib.pyplot, imported under its conventional alias plt.

import matplotlib.pyplot as plt

Scatterplot#

  • For a single point, provide (x_coordinate, y_coordinate).

  • For multiple points, provide (x_coordinate sequence, y_coordinate sequence), where each sequence can be a tuple, list, or series, and both sequences must have the same length.

# single point
plt.scatter(5,10);    # x=5, y=10
_images/cfffa6c0db918cbc70122f45ddc755773e67fa14543f54546b45342017019e06.png

If multiple points are given with their coordinates as follows:

  • x_coordinate sequence: [2, 4, 5]

  • y_coordinate sequence: [10, 3, 6]

  • In this setup:

    • 2 corresponds to the x-coordinate of the first point, and 10 corresponds to its y-coordinate.

    • 4 corresponds to the x-coordinate of the second point, and 3 corresponds to its y-coordinate.

    • 5 corresponds to the x-coordinate of the third point, and 6 corresponds to its y-coordinate.

In the following code, the points are: \((2,10), (4,3), (5,6)\).

# multiple points
plt.scatter([2,4,5],[10,3,6]);  
_images/109e8cc6a544bb0b99e8c36aadcbd8f8bf0dbfb166e61d8f5b16bbd4f85c4cd8.png
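
As a small sketch of the point about sequence types, the same three points can be passed as tuples or as NumPy arrays. NumPy is an assumption here, since the examples in this section only use lists, and get_offsets() is called only to confirm that both calls store the same coordinates.

```python
import matplotlib.pyplot as plt
import numpy as np

# the same three points, given as tuples and as NumPy arrays
from_tuples = plt.scatter((2, 4, 5), (10, 3, 6))
from_arrays = plt.scatter(np.array([2, 4, 5]), np.array([10, 3, 6]))

# each row of get_offsets() is one (x, y) point
print(from_tuples.get_offsets())
```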

Figure Size#

The size of a chart figure can be adjusted using the figsize=(width, height) parameter of plt.figure(). Both values are given in inches.

plt.figure(figsize=(10,3))    # width: 10 inches, height: 3 inches
plt.scatter([2,4,5],[10,3,6]);    
_images/e34db20c4a1a8b3c59c8fd7d5ff370c3b30d499196dca5fb3b130c44a189b78e.png
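
To check the size that was actually applied, one option is to keep the figure object returned by plt.figure() and read the size back. This is a minimal sketch; the variable names fig, width, and height are illustrative, and the pixel size depends on the dpi of the environment.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 3))        # width: 10 inches, height: 3 inches
plt.scatter([2, 4, 5], [10, 3, 6])

width, height = fig.get_size_inches()    # read the size back from the figure
print(width, height)
print(width * fig.dpi, height * fig.dpi) # approximate size in pixels
```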

Color#

The c or color parameter sets the color of the points (markers).

plt.scatter([2,4,5],[10,3,6], c='red');    # 'r' is shorthand for 'red'
_images/418d82b238ebfc9049e154ec657163f48a6d0ad0c92fefea03a4418f3178d013.png
plt.scatter([2,4,5],[10,3,6], color='r');    # color parameter
_images/418d82b238ebfc9049e154ec657163f48a6d0ad0c92fefea03a4418f3178d013.png

Size#

The s parameter sets the size of the points (markers). The value is the marker area in points squared.

plt.scatter([2,4,5],[10,3,6], s=300);    # s=300
_images/e613136359fa792c10d356db94d9d5510bb42b09c0b2d4814f1aa7e05a19fb67.png
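
Both s and c also accept one value per point. The sketch below assumes three illustrative sizes and colors (the lists sizes and colors are made up for this example) and reads them back from the returned collection.

```python
import matplotlib.pyplot as plt

x = [2, 4, 5]
y = [10, 3, 6]
sizes = [50, 200, 800]        # hypothetical marker areas, one per point
colors = ['r', 'g', 'b']      # hypothetical colors, one per point

points = plt.scatter(x, y, s=sizes, c=colors)

print(points.get_sizes())           # the three marker areas
print(points.get_facecolors())      # one RGBA row per point
```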

Markers and Scatters#

The marker parameter is used to modify the style of points (markers).

For more marker styles, see the matplotlib.markers documentation.

# plus symbol
plt.scatter([2,4,5],[10,3,6], marker='+');
_images/32c23a7384067686e0a84c30bbcddace446e6287b66650138c6374db4bfff145.png
# square
plt.scatter([2,4,5],[10,3,6], marker='s');
_images/7528bf9ecd11922e9199c5d989135892405994601e43002ae1ffa6cf4f58d27f.png
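
The available marker codes can also be listed from Python. This sketch relies on the MarkerStyle.markers dictionary in matplotlib.markers, which maps each marker code to a short description; the variable name marker_names is illustrative.

```python
from matplotlib.markers import MarkerStyle

# dictionary of marker code -> description, e.g. '+' -> 'plus'
marker_names = MarkerStyle.markers

for code, description in marker_names.items():
    print(repr(code), description)
```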

Transparency#

The alpha parameter is used to modify the transparency of points (markers).

  • alpha is a number between 0 (transparent) and 1 (opaque).

plt.scatter([2,4,5],[10,3,6], alpha=0.3);
_images/234dc1db1110a860b6c62e821e12583e293e06f5e04bfa729fdf86b97a74f9dc.png

Linewidth and Scatters#

The linewidths and edgecolors parameters set the width and color of the boundary of points (markers). The singular forms linewidth and edgecolor are also accepted.

plt.scatter([2,4,5],[10,3,6], c='r', s=200, linewidths=5, edgecolor='b');
_images/346695939dbc39af47b08573dd3f20f2ee865425cec01cbb8865e4e3d69591dc.png

Title#

plt.title() is used to add a title to the plot (the current axes).

plt.title('Three Square Points')
plt.scatter([2,4,5],[10,3,6], c='r', s=200, marker='s');
_images/232daa681f09282e79c7404f629d06eab12a2ef5cd0383ffa96ea136efae6dfe.png
# pad sets the gap (in points) between the title and the top of the plot
# fontsize and location (loc) can be modified.

plt.title('Three Points', fontsize=20, loc='right', pad=100, c='b')
plt.scatter([2,4,5],[10,3,6], c='r', s=200, marker='s');
_images/287512ec1cc0f73f0cf26d57d61ce97e1201162d26ce05fb389dd14c90b0bd6d.png

Suptitle#

plt.suptitle() is used to add a suptitle, a centered title for the whole figure that sits above the plot titles.

plt.title('Three Points', fontsize=20, loc='right', pad=100, c='b')
plt.suptitle('Different Colors', fontsize=15, c='g', y=1)
plt.scatter([2,4,5],[10,3,6], c='r', s=200, marker='s');
_images/1c58d9d1e4cbf5bc2bb232f1a04a4c45a5f3315f9168e3fdcb5619e8a910c4d3.png
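
The difference between the two shows up most clearly when a figure holds more than one plot. The sketch below uses plt.subplot(), covered later in this section, and assumes two illustrative plot titles: each plt.title() call labels only the current plot, while plt.suptitle() labels the figure once.

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 4))

plt.subplot(1, 2, 1)
left_title = plt.title('Reds')                    # title of the first plot only
plt.scatter([2, 4, 5], [10, 3, 6], c='r')

plt.subplot(1, 2, 2)
right_title = plt.title('Greens')                 # title of the second plot only
plt.scatter([5, 1, 9], [4, 8, 7], c='g')

figure_title = plt.suptitle('Different Colors')   # one title for the whole figure
```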

Grid#

plt.grid() is used to add horizontal and vertical grid lines to the figure.

plt.scatter([2,4,5],[10,3,6])
plt.grid();
_images/9034592daf366002dd4bf7b2b7f04b7b389cd64b903c4465a20cbf896a51c1ff.png
# vertical grid lines only
plt.scatter([2,4,5],[10,3,6])
plt.grid(axis='x');
_images/d9675d3ba2046e6bcbee507651db4b99c86147a0352d57f1d973fe3a5310ce50.png
# horizontal grid lines only
plt.scatter([2,4,5],[10,3,6])
plt.grid(axis='y');
_images/fb74c3e6149936d3d646cc32a96333697030345dc1bd21e952d0c734834f015b.png

Legend#

plt.legend() is used to add a legend to the figure. Each legend entry is taken from the label given to a plotting call.

plt.scatter([2,4,5],[10,3,6], c='r', label='reds')
plt.scatter([5,1,9],[4,8,7],  c='g', label='greens')
plt.legend();
_images/1339553539018d6ec0d52849e20dba94fdf655256836b17d9de4d5871f578d54.png
# location of legend: lower left

plt.scatter([2,4,5],[10,3,6], c='r', label='reds')
plt.scatter([5,1,9],[4,8,7],  c='g', label='greens')
plt.legend(loc='lower left');
_images/341bc554af1b5d3e930a32ae7a79a1f7187dc21100864fbab73574f54721fcbc.png

Axis Labels#

plt.xlabel() and plt.ylabel() are used to add x-axis and y-axis labels to the figure.

plt.scatter([2,4,5],[10,3,6])
plt.xlabel('x_coordinate')
plt.ylabel('y_coordinate');
_images/fd32f9b6b9d5dee8fb85ac517ada048429135abea463dbde87309341d8558996.png

Axis Ticks#

plt.xticks() and plt.yticks() are used to set the tick locations and, optionally, the tick labels of the x and y axes, respectively.

plt.scatter([2,4,5],[10,3,6])
plt.xticks([2,4,5], ['Washington', 'Alabama','Virginia']);
_images/378900e973878f00bf0b3be81f66dc8173dad958d9362b5b99952643bfab1c15.png
# rotation
plt.scatter([2,4,5],[10,3,6])
plt.xticks([2,4,5], ['Washington', 'Alabama','Virginia'], rotation=90);
_images/56ebacfb45bc144584a57855a9770e6712ef3c42d633426974473973f2de2041.png

Text#

plt.text() is used to add text to the figure, starting from the specified coordinate.

plt.scatter([2,4,5],[10,3,6])
plt.text(4.05,3,'Alabama');   # starts from the point (4.05,3)
_images/94ef21c818a761930fde77041bd3c9badd75af846befc35ad899773f21dd2f46.png
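
To label every point rather than just one, plt.text() can be called in a loop. This is a sketch only: the names list reuses the state names from the tick example, and the 0.05 horizontal shift is an arbitrary choice.

```python
import matplotlib.pyplot as plt

x = [2, 4, 5]
y = [10, 3, 6]
names = ['Washington', 'Alabama', 'Virginia']

plt.scatter(x, y)

labels = []
for x_i, y_i, name in zip(x, y, names):
    # shift the text slightly to the right so it does not cover the marker
    labels.append(plt.text(x_i + 0.05, y_i, name))
```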

Facecolor#

The facecolor of a chart figure (the area around the plot) can be adjusted using the facecolor parameter of plt.figure().

plt.figure(facecolor='g')
plt.scatter([2,4,5],[10,3,6]);
_images/71ce8ad0b997bcc57aee341163a54a42445d1c538cb77b0cb600a8d21f72ac73.png
# background of the plot area (axes)
plt.axes().set_facecolor('yellow')
plt.scatter([2,4,5],[10,3,6]);
_images/4bcced445e0a54d37a2755956184b9f04d39c2b0b85dc2734f845e046f883d1a.png
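
The two settings can be combined in one figure. In this sketch the variable names fig and ax are illustrative; the figure facecolor fills the area around the plot and the axes facecolor fills the plot area itself.

```python
import matplotlib.pyplot as plt

fig = plt.figure(facecolor='g')     # area around the plot
ax = plt.axes()                     # create the plot area
ax.set_facecolor('yellow')          # background of the plot area
plt.scatter([2, 4, 5], [10, 3, 6])

print(fig.get_facecolor())          # RGBA values of the figure background
print(ax.get_facecolor())           # RGBA values of the axes background
```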

Subplot#

plt.subplot() is used to create multiple plots within a single figure. Here’s a breakdown of its parameters:

  • The first parameter specifies the number of rows of subplots.

  • The second parameter specifies the number of columns of subplots.

  • The third parameter specifies the index of the subplot, starting from 1 in the upper-left corner and increasing from left to right along each row, then moving down to the next row.

# number of rows: 1
# number of columns: 3

plt.figure(figsize=(20,5))
plt.subplot(1,3,1)
plt.scatter([2,4],[8,7],c='r')
plt.subplot(1,3,2)
plt.scatter([1,2],[5,12],c='b')
plt.subplot(1,3,3)
plt.scatter([6,10],[1,2],c='g');
_images/b1ba65f1896c44621f1ba00f5c3d079f0740b2432e3e97ee6b2bcb1d9443d097.png
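
With more than one row, the index order is easier to see when each subplot is titled with its own index. The sketch below assumes a 2 by 3 grid and records the zero-based (row, column) position of every index; the dictionary name positions is illustrative.

```python
import matplotlib.pyplot as plt

n_rows, n_cols = 2, 3
plt.figure(figsize=(12, 6))

positions = {}
for index in range(1, n_rows * n_cols + 1):
    ax = plt.subplot(n_rows, n_cols, index)
    ax.set_title(f'index {index}')
    # zero-based grid cell occupied by this subplot
    spec = ax.get_subplotspec()
    positions[index] = (spec.rowspan.start, spec.colspan.start)

print(positions)
```

Indices 1 to 3 fill the first row from left to right, and indices 4 to 6 fill the second row.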

Lineplot#

plt.plot() creates a line plot by connecting the given points using line segments.

plt.plot([1,3],[7,10]);  # from (1,7) to (3,10)
_images/24a5d327f6d3f26ce187f9d534174d1296f95b8cfe57d85d861347652b883730.png

Linewidth and Lines#

The linewidth parameter is used to modify the width of the line.

plt.plot([1,3],[7,10], linewidth=7);
_images/a5a1134874fc19837736bc602fe1775ec021b731fa1ff8ed8a990cc8ebafa2ee.png

Color and Lines#

The c or color parameter is used to modify the color of the line.

plt.plot([1,3],[7,10], c='r');
_images/86dbe7618efbc30d70d7d5bd907a3cad6058471d2c8a722e0f03c02918bf2cd9.png

Linestyle and Lines#

The linestyle parameter is used to modify the style of the line.

plt.plot([1,7],[3,10], c='r', linestyle='dashed')
plt.plot([0,8],[12,3], c='g', linestyle='dotted');
_images/cff82dab9aea64e937353308dd3f5d51451b76edaec07d49091e1b7f1ce75563.png
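
Line styles also have short codes: '--' for dashed, ':' for dotted, and '-.' for dashdot. In this sketch the third line and the variable names are illustrative, and ls is used as the short alias of linestyle.

```python
import matplotlib.pyplot as plt

# plt.plot() returns a list of lines; unpack the single line from each call
(dashed_line,) = plt.plot([1, 7], [3, 10], c='r', linestyle='--')
(dotted_line,) = plt.plot([0, 8], [12, 3], c='g', ls=':')
(dashdot_line,) = plt.plot([0, 8], [5, 5], c='b', linestyle='dashdot')

# the long name is stored as its short code
print(dashdot_line.get_linestyle())
```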

Markers and Lines#

The marker parameter is used to specify the style of the markers, while the markersize parameter is used to adjust their size.

plt.plot([1,3],[7,10], c='r', marker='*', markersize=10);
_images/4f74bc8d4d5baeeb928cbb03b57183466fff3c39997c48e7d2f3420e6f7d0416.png
# multiple points
plt.plot([1,3,5,7,9,11,13,13],[7,10,1,8,3,6,2,1], c='r', marker='*', markersize=20);
_images/610b3e4e500ab6ff9a76fadc4abc9a9d56dd00a88b142ec104ca7a27b574ee94.png

Pie Chart#

plt.pie() is used to generate a pie chart. Two commonly used parameters are:

  • The radius parameter adjusts the size of the pie chart.

  • The autopct parameter is used to include and customize the display of percentage values.

    • ‘%.0f’: 0 represents the number of decimal places, ‘f’ represents float. Add ‘%%’ at the end (‘%.0f%%’) to display a percent sign.

expenses = ['housing', 'tuition', 'transportation', 'supplies', 'food']
cost = [700, 1000, 200, 100, 500 ]
color_set = ['y', 'r', 'g', 'orange', 'navy']
plt.pie(cost, colors = color_set, labels=expenses, radius=1, autopct='%.0f');
_images/cbbf02138e678be25153f7c634c05638ab2875ace9ec162838b54de49e421551.png
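
The autopct string is an ordinary Python format string applied to the percentage of each slice. The sketch below computes those percentages by hand to show what a format such as '%.1f%%' produces; the variable names total and formatted are illustrative.

```python
import matplotlib.pyplot as plt

expenses = ['housing', 'tuition', 'transportation', 'supplies', 'food']
cost = [700, 1000, 200, 100, 500]

# percentage of each slice, formatted the same way autopct formats it
total = sum(cost)
formatted = ['%.1f%%' % (100 * c / total) for c in cost]
print(formatted)   # ['28.0%', '40.0%', '8.0%', '4.0%', '20.0%']

# one decimal place followed by a literal percent sign
plt.pie(cost, labels=expenses, autopct='%.1f%%')
```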

Barplot#

plt.bar() is used to generate a bar plot.

plt.bar(expenses, cost);
_images/5f91f271bc11bb4d134d6a0688bbb70b28afdfdc1e2f6bc3eba30b559395bad8.png
# horizontal
plt.barh(expenses, cost, color='r' );
_images/f632871a62f98218053bdc14b7812ba2926759a74a600c24c4405cc75ed9dcb0.png

Boxplot#

plt.boxplot() is used to generate a box plot, which represents the five-number summary:

  1. Minimum

  2. 1st quartile (median of the first half)

  3. 2nd quartile (median)

  4. 3rd quartile (median of the second half)

  5. Maximum

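  • By default, the whiskers extend to the most extreme data points within 1.5 times the interquartile range from the box, so they reach the minimum and maximum only when there are no outliers.

  • Open circles are used to represent outliers, which are the points beyond the whiskers.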

plt.boxplot([8,7,3,2,1,5,6,7,8,15]);
_images/00dd9148e9ed379454cc005e088d5de86135d2babe572e02bbe0315fd86889d4.png
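
The numbers behind the plot can be inspected with matplotlib.cbook.boxplot_stats(), which computes the statistics that boxplot draws. This is a sketch for the same data; note that the quartiles are computed by linear interpolation, so they can differ from the "median of each half" rule (3.5 and 7.75 here, rather than 3 and 8).

```python
import matplotlib.pyplot as plt
from matplotlib import cbook

data = [8, 7, 3, 2, 1, 5, 6, 7, 8, 15]

stats = cbook.boxplot_stats(data)[0]   # one dictionary per data set
print('Q1:', stats['q1'], 'median:', stats['med'], 'Q3:', stats['q3'])
print('whiskers:', stats['whislo'], stats['whishi'])
print('outliers:', stats['fliers'])

plt.boxplot(data)
```

Here the upper whisker stops at 8, and 15 is drawn as an outlier because it lies more than 1.5 times the interquartile range above the third quartile.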

Histogram#

plt.hist() is used to generate histograms, which represent the distribution of the data.

plt.hist([1,1,8,7,3,2,1,5,6,7,8,15,15,15,15]);
_images/1acd714e281c66fad84b36f2f25f0f7dd1438a78603aaf5ad04559aa05ed7045.png
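
The bins parameter controls how the values are grouped. The sketch below assumes three illustrative bins with explicit edges and keeps the values returned by plt.hist() to show how many data points fall in each bin.

```python
import matplotlib.pyplot as plt

data = [1, 1, 8, 7, 3, 2, 1, 5, 6, 7, 8, 15, 15, 15, 15]

# three bins with explicit edges: [0, 5), [5, 10), [10, 15]
# the last bin also includes its right edge
counts, edges, bars = plt.hist(data, bins=[0, 5, 10, 15])
print(counts)   # number of values that fall in each bin
```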

Save a Figure#

  • You can save a figure to a local computer or Google Drive.

Local Computer#

# save to a local computer
# same folder with your code
# name of the saved picture is 'states_plot.png'

plt.scatter([2,4,5],[10,3,6])
plt.savefig('states_plot.png')
_images/109e8cc6a544bb0b99e8c36aadcbd8f8bf0dbfb166e61d8f5b16bbd4f85c4cd8.png
# save to a local computer
# to a folder called data_files (the folder must already exist)

plt.scatter([2,4,5],[10,3,6])
plt.savefig('data_files/states_plot.png')
_images/109e8cc6a544bb0b99e8c36aadcbd8f8bf0dbfb166e61d8f5b16bbd4f85c4cd8.png
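
If the target folder might not exist yet, it can be created before saving. This is a minimal sketch; the dpi and bbox_inches values are illustrative choices rather than settings used elsewhere in this section.

```python
import os
import matplotlib.pyplot as plt

folder = 'data_files'
os.makedirs(folder, exist_ok=True)      # create the folder if it is missing

plt.scatter([2, 4, 5], [10, 3, 6])

path = os.path.join(folder, 'states_plot.png')
# dpi sets the resolution; bbox_inches='tight' trims extra white margins
plt.savefig(path, dpi=150, bbox_inches='tight')
```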

Google Colab#

# mount Google Drive
# this connects the Colab notebook to your Drive
from google.colab import drive
drive.mount('/content/drive')

# save it to a folder in My Drive called data_files
# the name of the saved picture is states_plot.png

plt.scatter([2,4,5],[10,3,6])
plt.savefig('/content/drive/My Drive/data_files/states_plot.png')