Jacqueline Routhier

My Final Code is

#AT Math Unit 2
#pandas and matplotlib.pyplot are libraries (add-ons)
import pandas as pd
import matplotlib.pyplot as plt

#this reads the data set from the CSV file
data = pd.read_csv('co2_emissions.csv', na_values=None, low_memory=False)

#read each of the 1990-2012 columns as numeric data
#errors='coerce' means that any value that can't be read as a number is treated as a missing number (NaN)
data_1990 = pd.to_numeric(data['1990'], errors='coerce')
data_1991 = pd.to_numeric(data['1991'], errors='coerce')
data_1992 = pd.to_numeric(data['1992'], errors='coerce')
data_1993 = pd.to_numeric(data['1993'], errors='coerce')
data_1994 = pd.to_numeric(data['1994'], errors='coerce')
data_1995 = pd.to_numeric(data['1995'], errors='coerce')
data_1996 = pd.to_numeric(data['1996'], errors='coerce')
data_1997 = pd.to_numeric(data['1997'], errors='coerce')
data_1998 = pd.to_numeric(data['1998'], errors='coerce')
data_1999 = pd.to_numeric(data['1999'], errors='coerce')
data_2000 = pd.to_numeric(data['2000'], errors='coerce')
data_2001 = pd.to_numeric(data['2001'], errors='coerce')
data_2002 = pd.to_numeric(data['2002'], errors='coerce')
data_2003 = pd.to_numeric(data['2003'], errors='coerce')
data_2004 = pd.to_numeric(data['2004'], errors='coerce')
data_2005 = pd.to_numeric(data['2005'], errors='coerce')
data_2006 = pd.to_numeric(data['2006'], errors='coerce')
data_2007 = pd.to_numeric(data['2007'], errors='coerce')
data_2008 = pd.to_numeric(data['2008'], errors='coerce')
data_2009 = pd.to_numeric(data['2009'], errors='coerce')
data_2010 = pd.to_numeric(data['2010'], errors='coerce')
data_2011 = pd.to_numeric(data['2011'], errors='coerce')
data_2012 = pd.to_numeric(data['2012'], errors='coerce')
#drop all the missing numbers
clean_1990 = data_1990.dropna()
clean_1991 = data_1991.dropna()
clean_1992 = data_1992.dropna()
clean_1993 = data_1993.dropna()
clean_1994 = data_1994.dropna()
clean_1995 = data_1995.dropna()
clean_1996 = data_1996.dropna()
clean_1997 = data_1997.dropna()
clean_1998 = data_1998.dropna()
clean_1999 = data_1999.dropna()
clean_2000 = data_2000.dropna()
clean_2001 = data_2001.dropna()
clean_2002 = data_2002.dropna()
clean_2003 = data_2003.dropna()
clean_2004 = data_2004.dropna()
clean_2005 = data_2005.dropna()
clean_2006 = data_2006.dropna()
clean_2007 = data_2007.dropna()
clean_2008 = data_2008.dropna()
clean_2009 = data_2009.dropna()
clean_2010 = data_2010.dropna()
clean_2011 = data_2011.dropna()
clean_2012 = data_2012.dropna()

#below makes the scatterplot
#giving values to x and y
x=1990,1990, 1990, 1990, 1990,1991, 1991,1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1993, 1993, 1994, 1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996, 1996, 1997, 1997, 1997, 1997, 1997, 1998, 1998, 1998, 1998, 1998, 1999 ,1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002,2003,2003,2003,2003,2003,2004,2004,2004,2004,2004,2005,2005,2005,2005,2005, 2006,2006,2006,2006,2006, 2007,2007,2007,2007,2007,2008,2008,2008,2008,2008,2009,2009,2009,2009,2009,2010,2010, 2010, 2010, 2010, 2011,2011,2011,2011,2011, 2012, 2012,2012,2012,2012
y= clean_1990.head(5), clean_1991.head(5), clean_1992.head(5), clean_1993.head(5),clean_1994.head(5),clean_1995.head(5),clean_1996.head(5),clean_1997.head(5),clean_1998.head(5), clean_1999.head(5), clean_2000.head(5), clean_2001.head(5), clean_2002.head(5), clean_2003.head(5), clean_2004.head(5), clean_2005.head(5), clean_2006.head(5), clean_2007.head(5), clean_2008.head(5), clean_2009.head(5), clean_2010.head(5),clean_2011.head(5),clean_2012.head(5)
plt.xlabel("Year")
plt.ylabel("CO2 Emissions (Tonnes per Person)")
plt.title("CO2 Emissions from 1990 to 2012")
plt.scatter(x,y, color='k')
plt.show()
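
The conversion and dropna lines above repeat the same two steps for every year. A minimal sketch of how a loop could do that work is below. The function name clean_year_columns and the small sample table are made up for illustration; the real file would still be loaded with pd.read_csv as in the code above.

```python
import pandas as pd

def clean_year_columns(data, first_year=1990, last_year=2012):
    """Return a dictionary of {year: numeric column with missing values dropped}."""
    clean = {}
    for year in range(first_year, last_year + 1):
        # Column names in the CSV are strings such as '1990'
        numeric = pd.to_numeric(data[str(year)], errors='coerce')
        clean[year] = numeric.dropna()
    return clean

# Tiny made-up table standing in for co2_emissions.csv
sample = pd.DataFrame({
    '1990': ['1.5', 'n/a', '3.0'],
    '1991': ['2.0', '2.5', None],
})

clean = clean_year_columns(sample, 1990, 1991)
print(clean[1990].tolist())   # [1.5, 3.0]
print(clean[1991].tolist())   # [2.0, 2.5]
```

With this layout, clean[1990] plays the role of clean_1990, so the rest of the plotting code could look values up by year instead of by variable name.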

My Semi Cleaned up Code is

#AT Math Unit 2
#pandas and matplotlib.pyplot are libraries (add-ons)
import pandas as pd
import matplotlib.pyplot as plt

#this allows you to read the data from set
data = pd.read_csv('co2_emissions.csv', na_values=None, low_memory=False)

#read the 1990 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1990 = pd.to_numeric(data['1990'], errors='coerce')
#drop all the missing number
clean_1990 = data_1990.dropna()

#read the 1991 column as a numeric data
data_1991 = pd.to_numeric(data['1991'], errors='coerce')
#drop all the missing number
clean_1991 = data_1991.dropna()

#read the 1992 column as a numeric data
data_1992 = pd.to_numeric(data['1992'], errors='coerce')
#drop all the missing number
clean_1992 = data_1992.dropna()

#read the 1993 column as a numeric data
data_1993 = pd.to_numeric(data['1993'], errors='coerce')
#drop all the missing number
clean_1993 = data_1993.dropna()

#read the 1994 column as a numeric data
data_1994 = pd.to_numeric(data['1994'], errors='coerce')
#drop all the missing number
clean_1994 = data_1994.dropna()

#read the 1995 column as a numeric data
data_1995 = pd.to_numeric(data['1995'], errors='coerce')
#drop all the missing number
clean_1995 = data_1995.dropna()

#read the 1996 column as a numeric data
data_1996 = pd.to_numeric(data['1996'], errors='coerce')
#drop all the missing number
clean_1996 = data_1996.dropna()

#read the 1997 column as a numeric data
data_1997 = pd.to_numeric(data['1997'], errors='coerce')
#drop all the missing number
clean_1997 = data_1997.dropna()

#read the 1998 column as a numeric data
data_1998 = pd.to_numeric(data['1998'], errors='coerce')
#drop all the missing number
clean_1998 = data_1998.dropna()

#read the 1999 column as a numeric data
data_1999 = pd.to_numeric(data['1999'], errors='coerce')
#drop all the missing number
clean_1999 = data_1999.dropna()

#read the 2000 column as a numeric data
data_2000 = pd.to_numeric(data['2000'], errors='coerce')
#drop all the missing number
clean_2000 = data_2000.dropna()

#read the 2001 column as a numeric data
data_2001 = pd.to_numeric(data['2001'], errors='coerce')
#drop all the missing number
clean_2001 = data_2001.dropna()

#read the 2002 column as a numeric data
data_2002 = pd.to_numeric(data['2002'], errors='coerce')
#drop all the missing number
clean_2002 = data_2002.dropna()

#read the 2003 column as a numeric data
data_2003 = pd.to_numeric(data['2003'], errors='coerce')
#drop all the missing number
clean_2003 = data_2003.dropna()

#read the 2004 column as a numeric data
data_2004 = pd.to_numeric(data['2004'], errors='coerce')
#drop all the missing number
clean_2004 = data_2004.dropna()

#read the 2005 column as a numeric data
data_2005 = pd.to_numeric(data['2005'], errors='coerce')
#drop all the missing number
clean_2005 = data_2005.dropna()

#read the 2006 column as a numeric data
data_2006 = pd.to_numeric(data['2006'], errors='coerce')
#drop all the missing number
clean_2006 = data_2006.dropna()

#read the 2007 column as a numeric data
data_2007 = pd.to_numeric(data['2007'], errors='coerce')
#drop all the missing number
clean_2007 = data_2007.dropna()

#read the 2008 column as a numeric data
data_2008 = pd.to_numeric(data['2008'], errors='coerce')
#drop all the missing number
clean_2008 = data_2008.dropna()

#read the 2009 column as a numeric data
data_2009 = pd.to_numeric(data['2009'], errors='coerce')
#drop all the missing number
clean_2009 = data_2009.dropna()

#read the 2010 column as a numeric data
data_2010 = pd.to_numeric(data['2010'], errors='coerce')
#drop all the missing number
clean_2010 = data_2010.dropna()

#read the 2011 column as a numeric data
data_2011 = pd.to_numeric(data['2011'], errors='coerce')
#drop all the missing number
clean_2011 = data_2011.dropna()

#read the 2012 column as a numeric data
data_2012 = pd.to_numeric(data['2012'], errors='coerce')
#drop all the missing number
clean_2012 = data_2012.dropna()

#below makes the scatterplot
x=1990,1990, 1990, 1990, 1990,1991, 1991,1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1993, 1993, 1994, 1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996, 1996, 1997, 1997, 1997, 1997, 1997, 1998, 1998, 1998, 1998, 1998, 1999 ,1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002,2003,2003,2003,2003,2003,2004,2004,2004,2004,2004,2005,2005,2005,2005,2005, 2006,2006,2006,2006,2006, 2007,2007,2007,2007,2007,2008,2008,2008,2008,2008,2009,2009,2009,2009,2009,2010,2010, 2010, 2010, 2010, 2011,2011,2011,2011,2011, 2012, 2012,2012,2012,2012
y= clean_1990.head(5), clean_1991.head(5), clean_1992.head(5), clean_1993.head(5),clean_1994.head(5),clean_1995.head(5),clean_1996.head(5),clean_1997.head(5),clean_1998.head(5), clean_1999.head(5), clean_2000.head(5), clean_2001.head(5), clean_2002.head(5), clean_2003.head(5), clean_2004.head(5), clean_2005.head(5), clean_2006.head(5), clean_2007.head(5), clean_2008.head(5), clean_2009.head(5), clean_2010.head(5),clean_2011.head(5),clean_2012.head(5)
plt.xlabel("Year")
plt.ylabel("CO2 Emissions (Tonnes per Person)")
plt.title("CO2 Emissions from 1990 to 2012")
plt.scatter(x,y, color='k')
plt.show()

My Almost Final Code is

#AT Math Unit 2
#pandas, numpy, and matplotlib.pyplot are libraries (add-ons)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#this allows you to read the data from set
data = pd.read_csv('co2_emissions.csv', na_values=None, low_memory=False)

#read the 1990 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1990 = pd.to_numeric(data['1990'], errors='coerce')
#drop all the missing number
clean_1990 = data_1990.dropna()
#read first 5 rows of data
print(clean_1990.head(5))

#read the 1991 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1991 = pd.to_numeric(data['1991'], errors='coerce')
#drop all the missing number
clean_1991 = data_1991.dropna()
#read first 5 rows of data
print(clean_1991.head(5))

#read the 1992 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1992 = pd.to_numeric(data['1992'], errors='coerce')
#drop all the missing number
clean_1992 = data_1992.dropna()
#read first 5 rows of data
print(clean_1992.head(5))

#read the 1993 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1993 = pd.to_numeric(data['1993'], errors='coerce')
#drop all the missing number
clean_1993 = data_1993.dropna()
#read first 5 rows of data
print(clean_1993.head(5))

#read the 1994 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1994 = pd.to_numeric(data['1994'], errors='coerce')
#drop all the missing number
clean_1994 = data_1994.dropna()
#read first 5 rows of data
print(clean_1994.head(5))

#read the 1995 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1995 = pd.to_numeric(data['1995'], errors='coerce')
#drop all the missing number
clean_1995 = data_1995.dropna()
#read first 5 rows of data
print(clean_1995.head(5))

#read the 1996 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1996 = pd.to_numeric(data['1996'], errors='coerce')
#drop all the missing number
clean_1996 = data_1996.dropna()
#read first 5 rows of data
print(clean_1996.head(5))

#read the 1997 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1997 = pd.to_numeric(data['1997'], errors='coerce')
#drop all the missing number
clean_1997 = data_1997.dropna()
#read first 5 rows of data
print(clean_1997.head(5))

#read the 1998 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1998 = pd.to_numeric(data['1998'], errors='coerce')
#drop all the missing number
clean_1998 = data_1998.dropna()
#read first 5 rows of data
print(clean_1998.head(5))

#read the 1999 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_1999 = pd.to_numeric(data['1999'], errors='coerce')
#drop all the missing number
clean_1999 = data_1999.dropna()
#read first 5 rows of data
print(clean_1999.head(5))

#read the 2000 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2000 = pd.to_numeric(data['2000'], errors='coerce')
#drop all the missing number
clean_2000 = data_2000.dropna()
#read first 5 rows of data
print(clean_2000.head(5))

#read the 2001 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2001 = pd.to_numeric(data['2001'], errors='coerce')
#drop all the missing number
clean_2001 = data_2001.dropna()
#read first 5 rows of data
print(clean_2001.head(5))

#read the 2002 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2002 = pd.to_numeric(data['2002'], errors='coerce')
#drop all the missing number
clean_2002 = data_2002.dropna()
#read first 5 rows of data
print(clean_2002.head(5))

#read the 2003 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2003 = pd.to_numeric(data['2003'], errors='coerce')
#drop all the missing number
clean_2003 = data_2003.dropna()
#read first 5 rows of data
print(clean_2003.head(5))

#read the 2004 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2004 = pd.to_numeric(data['2004'], errors='coerce')
#drop all the missing number
clean_2004 = data_2004.dropna()
#read first 5 rows of data
print(clean_2004.head(5))

#read the 2005 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2005 = pd.to_numeric(data['2005'], errors='coerce')
#drop all the missing number
clean_2005 = data_2005.dropna()
#read first 5 rows of data
print(clean_2005.head(5))

#read the 2006 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2006 = pd.to_numeric(data['2006'], errors='coerce')
#drop all the missing number
clean_2006 = data_2006.dropna()
#read first 5 rows of data
print(clean_2006.head(5))

#read the 2007 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2007 = pd.to_numeric(data['2007'], errors='coerce')
#drop all the missing number
clean_2007 = data_2007.dropna()
#read first 5 rows of data
print(clean_2007.head(5))

#read the 2008 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2008 = pd.to_numeric(data['2008'], errors='coerce')
#drop all the missing number
clean_2008 = data_2008.dropna()
#read first 5 rows of data
print(clean_2008.head(5))

#read the 2009 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2009 = pd.to_numeric(data['2009'], errors='coerce')
#drop all the missing number
clean_2009 = data_2009.dropna()
#read first 5 rows of data
print(clean_2009.head(5))

#read the 2010 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2010 = pd.to_numeric(data['2010'], errors='coerce')
#drop all the missing number
clean_2010 = data_2010.dropna()
#read first 5 rows of data
print(clean_2010.head(5))

#read the 2011 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2011 = pd.to_numeric(data['2011'], errors='coerce')
#drop all the missing number
clean_2011 = data_2011.dropna()
#read first 5 rows of data
print(clean_2011.head(5))

#read the 2012 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2012 = pd.to_numeric(data['2012'], errors='coerce')
#drop all the missing number
clean_2012 = data_2012.dropna()
#read first 5 rows of data
print(clean_2012.head(5))

#below makes the scatterplot
x1=1990,1990, 1990, 1990, 1990,1991, 1991,1991, 1991, 1991, 1992, 1992, 1992, 1992, 1992, 1993, 1993, 1993, 1993, 1993, 1994, 1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996, 1996, 1997, 1997, 1997, 1997, 1997, 1998, 1998, 1998, 1998, 1998, 1999 ,1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002,2003,2003,2003,2003,2003,2004,2004,2004,2004,2004,2005,2005,2005,2005,2005, 2006,2006,2006,2006,2006, 2007,2007,2007,2007,2007,2008,2008,2008,2008,2008,2009,2009,2009,2009,2009,2010,2010, 2010, 2010, 2010, 2011,2011,2011,2011,2011, 2012, 2012,2012,2012,2012
y1= clean_1990.head(5), clean_1991.head(5), clean_1992.head(5), clean_1993.head(5),clean_1994.head(5),clean_1995.head(5),clean_1996.head(5),clean_1997.head(5),clean_1998.head(5), clean_1999.head(5), clean_2000.head(5), clean_2001.head(5), clean_2002.head(5), clean_2003.head(5), clean_2004.head(5), clean_2005.head(5), clean_2006.head(5), clean_2007.head(5), clean_2008.head(5), clean_2009.head(5), clean_2010.head(5),clean_2011.head(5),clean_2012.head(5)
plt.xlabel("Year")
plt.ylabel("CO2 Emissions (Tonnes per Person)")
plt.title("CO2 Emissions from 1990 to 2012")
plt.scatter(x1,y1, color='k')
plt.show()

Reflection:

To get to my final code I went through a series of steps.

I had to find and choose my data set. I asked Darlene about finding good data and, as a result, I used Gapminder. I chose the set on CO2 emissions because it sounded interesting and it had quite a bit of data.
I had to figure out how to import my data into Spyder. I originally tried to use the code we were taught for making bar graphs:
#pandas, numpy, and matplotlib.pyplot are libraries (add-ons)
import pandas
import numpy as np
import matplotlib.pyplot as plt
#this code allows you to read the Python U2P1 file and call it plt
#so that you will be able to access the file easier
data = pandas.read_csv("co2_emissions.csv", low_memory=False)
#the following code allows python to count the distribution of t
print("2012")
response_exp = data["2012"].value_counts(sort=False)
print(response_exp)
But it gave me this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 0: invalid start byte
After looking up what this meant for a while and having no luck, I decided to look up another way of reading a CSV:
readCSV = csv.reader(csvfile, delimiter=',')
2000 = []
2012 = []
for row in readCSV:
    2000 = row[241]
    2012 = row[252]
    2000.append(2000)
    2012.append(2012)
print(2000)
print(2012)
This gave me issues as well. At this point I decided to get some help from Darlene by emailing her. While waiting for her response I kept looking for ways to fix the errors I was receiving, but none of them made sense to me. Eventually she responded and told me that there was a row in my data sheet that was messing things up. She gave me this code:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = pd.read_csv('co2emissions.csv', na_values=None, low_memory=False)
#read the 2012 column as a numeric data
#errors='coerce' means that any missing data will be treated as missing number
data_2012 = pd.to_numeric(data['2012'], errors='coerce')

#drop all the missing number
clean_2012 = data_2012.dropna()
#read the first six rows of the numbers
print(clean_2012.head(6))
#stating the five-number summary
desc_2012 = clean_2012.describe()
print(desc_2012)
plt.hist(clean_2012)
plt.title("2012 CO2 emissions")
plt.xlabel("Per Capita CO2 emissions")
I used this as a starting point for my plot.
I had to figure out how to make a scatter plot. This was very difficult for me. I searched Google for how to make one and got many results. I decided to use the approach that actually made sense to me: set what x is equal to and what y is equal to, then call plt.scatter(x,y) to get a plot. The most challenging part was getting the data into y without just typing out the numbers. I tried x=2000 and y=clean_2000, but this gave me an error saying x and y have to be the same size. After trying many things I realized that to plot all the data from the year 2000 I would have to type 2000 around 235 times to match the number of rows in the spreadsheet. Obviously that would be impractical, so I decided to stick with the first five rows of data. I did try ways of getting 2000 to repeat, but I haven't figured it out yet.
I had to run my code. I lost track of how many times I ran my code, but it was a lot. Every time I made even a small change I would run it to make sure I didn't mess it up.
I had to fix my code. As you can see in the previous steps, fixing the code was a constant chore throughout the process.
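
The size error described above happens because plt.scatter needs one x value for every y value. One possible way to repeat a year without typing it out is NumPy's np.repeat. The sketch below is only an illustration: the helper name year_points and the sample numbers are made up, and the real clean_2000 column would come from the code earlier on this page.

```python
import numpy as np
import pandas as pd

def year_points(clean_column, year):
    """Pair every value in one year's column with that year."""
    y = clean_column.to_numpy()
    x = np.repeat(year, len(y))   # the year repeated once per value
    return x, y

# Made-up values standing in for clean_2000 and clean_2012
clean_2000 = pd.Series([0.2, 1.4, 3.1, 7.5])
clean_2012 = pd.Series([0.3, 1.6, 2.9])

x_2000, y_2000 = year_points(clean_2000, 2000)
x_2012, y_2012 = year_points(clean_2012, 2012)

# Join the years together so one scatter call can plot all of them
x = np.concatenate([x_2000, x_2012])
y = np.concatenate([y_2000, y_2012])

print(x_2000)            # [2000 2000 2000 2000]
print(len(x), len(y))    # 7 7
```

Because x and y always come out the same length, plt.scatter(x, y) could then use every row of a year instead of only the first five.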

What to Improve

To make my process more efficient I need to block out times with specific goals so I feel pressure to get the work done in that time. I started doing this on Saturday for this project. Another thing is knowing good sources to look to for Python advice, because this will limit the amount of time I have to spend on research. In addition, I could stick with one way of writing the code for a little longer instead of bouncing between different approaches and getting myself more confused. I could also run my code less often, but that could make it harder to find where an error was introduced. Lastly, I need to not get hung up when I can't do something right away, because I will learn it eventually; it will just take time.

How Would I Move Forwards?

My next step would be learning to color code countries so it is clear on the scatter plot which point is which. I would also like to learn how to take data from the middle of the data set. I still have a lot to learn about coding, but those are two things I am really interested in doing that I couldn't do for this project due to lack of time.
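
As a starting point for both of those next steps, here is a rough sketch. It assumes the table has a column of country names called 'country', which is a guess and not necessarily the name used in the Gapminder file; the numbers are made up. Rows from the middle of a table can be picked by position with .iloc, and a list with one color per row can be built with a short loop.

```python
import pandas as pd

# Made-up rows standing in for the Gapminder table
data = pd.DataFrame({
    'country': ['A', 'B', 'C', 'D', 'E', 'F'],
    '2012': [0.3, 1.2, 4.8, 9.1, 2.2, 6.4],
})

# Rows at positions 2, 3 and 4 (the end position 5 is not included)
middle = data.iloc[2:5]
print(middle['country'].tolist())   # ['C', 'D', 'E']

# One color per country, cycling through a short list of color names
palette = ['red', 'blue', 'green', 'orange']
colors = [palette[i % len(palette)] for i in range(len(middle))]
print(colors)                       # ['red', 'blue', 'green']
```

The colors list could then be passed to the scatter plot as plt.scatter(x, y, c=colors), as long as it has one entry per plotted point.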

co2_emissions.csv

I updated my AT proposals to be more specific
I found a data set on CO2 emissions from gapminder.org and saved it as a CSV file so that I could use it in Spyder
I started to code so that I could read the data in order to make my scatter plot

I first used this code:
#pandas, numpy, and matplotlib.pyplot are libraries (add-ons)
import pandas
import numpy as np
import matplotlib.pyplot as plt
#this code allows you to read the Python U2P1 file and call it plt
#so that you will be able to access the file easier
data = pandas.read_csv("co2_emissions.csv", low_memory=False)
#the following code allows python to count the distribution of t
print("2012")
response_exp = data["2012"].value_counts(sort=False)
print(response_exp)
But it gave me this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 0: invalid start byte
After looking up what this meant for a while and having no luck, I decided to look up another way of reading a CSV.
I found this video, https://www.youtube.com/watch?v=K_oXb04izZM, that showed a way. I tried it and was able to see what was in the data, but the output was just a long list of numbers.
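
The UnicodeDecodeError above means the file contains a byte that is not valid UTF-8, which is the encoding pandas tries by default. One thing that can be tried in this situation is passing a different encoding to read_csv. The sketch below uses a tiny made-up file that starts with the same 0x8f byte; whether 'latin-1' is the right choice for the real Gapminder file is an assumption, and some characters may display incorrectly even when the file loads.

```python
import os
import tempfile
import pandas as pd

# Write a tiny made-up CSV whose first byte is 0x8f, which is not valid UTF-8
raw = b"\x8fcountry,2012\nA,1.5\nB,2.5\n"
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as handle:
    handle.write(raw)
    path = handle.name

try:
    pd.read_csv(path)          # the default encoding is UTF-8
    utf8_worked = True
except UnicodeDecodeError:
    utf8_worked = False

# latin-1 maps every possible byte to a character, so decoding cannot fail
data = pd.read_csv(path, encoding="latin-1")
os.remove(path)

print(utf8_worked)             # False
print(data["2012"].tolist())   # [1.5, 2.5]
```

This only gets the file to load. It does not remove a bad row from the spreadsheet, so columns may still need pd.to_numeric with errors='coerce' afterwards.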

I continued following along with the video, and he started talking about reading only certain columns, which is what I wanted: the data spans from the 1700s to 2013, and I only wanted 2000 and 2012 to start. I used this code:
import csv
with open('co2_emissions.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    2000 = []
    2012 = []
    for row in readCSV:
        2000 = row[241]
        2012 = row[252]
        2000.append(2000)
        2012.append(2012)
    print(2000)
    print(2012)
This didn't work because of the syntax in the append lines. I am still trying to figure that out, because I can't make a graph if I can't read the data.
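
A likely reason the attempt above fails is that 2000 and 2012 are numbers, and Python does not allow a number to be used as a variable name, so a line like 2000 = [] is a syntax error. A minimal corrected sketch is below. The names values_2000 and values_2012 are made up, and a small in-memory file stands in for co2_emissions.csv so the example can run on its own.

```python
import csv
import io

# Made-up three-column file standing in for co2_emissions.csv
csvfile = io.StringIO("country,2000,2012\nA,1.0,1.5\nB,2.0,2.5\n")

readCSV = csv.reader(csvfile, delimiter=',')
header = next(readCSV)               # the first row holds the column names
col_2000 = header.index('2000')      # look up positions instead of counting columns
col_2012 = header.index('2012')

values_2000 = []
values_2012 = []
for row in readCSV:
    # Each row is a list of strings, one per column
    values_2000.append(row[col_2000])
    values_2012.append(row[col_2012])

print(values_2000)   # ['1.0', '2.0']
print(values_2012)   # ['1.5', '2.5']
```

The csv module returns every value as text, so the lists would still need converting with float() before plotting.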

I emailed Darlene because I wanted to go back to the code I originally used and figure out what the error means and how I can get past it.

I started working with my data today for the analytics paper. I struggled at the beginning because Spyder would read the file and report it back in table form, but when I told it to report one column it said there were no numbers. This confused me for a while, and I tried rewriting the numbers into a new column, but it still didn't work. After about 15 minutes I realized Excel was adding commas to the numbers and this was messing up the code. Once I fixed that, it was smooth sailing to make the boxplot and histogram.
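
When a spreadsheet writes numbers with commas, such as 1,200.5, pandas reads them as text. Two possible ways to handle this in code, instead of fixing the spreadsheet by hand, are sketched below. The sample text is made up and the column name '2012' is only an example.

```python
import io
import pandas as pd

# Made-up file where a spreadsheet has written numbers with commas
text = 'country,2012\nA,"1,200.5"\nB,"3,400.0"\nC,12.5\n'

# Option 1: tell read_csv that a comma inside a number is a thousands separator
fixed = pd.read_csv(io.StringIO(text), thousands=',')
print(fixed['2012'].tolist())   # [1200.5, 3400.0, 12.5]

# Option 2: the file is already loaded as text, so strip the commas afterwards
plain = pd.read_csv(io.StringIO(text))
repaired = pd.to_numeric(plain['2012'].str.replace(',', ''), errors='coerce')
print(repaired.tolist())        # [1200.5, 3400.0, 12.5]
```

Either way the column ends up numeric, so describe(), boxplots, and histograms can be made from it.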

Final Coding Reflection

September 17 Work Block 10-12

Jan 10 Unit 5 Process
