Friday December 29th
Data Continued…
This is really quite fun!
- I’m really enjoying the Data Science work. It’s challenging and rewarding, but it also makes sense.
- Today, I analyzed some wine data. We started out with two CSV files.
- They were initially separated by semicolons, so I had to fix that!
- There were two separate data files with similar columns: red and white. We were tasked with creating a column called “color” to record the color of each wine, and then merging the two datasets together, which makes sense.
- Observed the data, created a color column in each dataset (white and red), and merged the two into a new edited file containing both datasets.
- I verified that both datasets were, in fact, appended by looking at the head and tail of the combined dataset.
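The steps above can be sketched roughly as follows, assuming pandas. The in-memory CSV text, the column names, and the values are made-up stand-ins for the real wine files, and `pd.concat` is used here for the append step as an assumption about how the merge could be done.

```python
import io
import pandas as pd

# Tiny stand-ins for the two semicolon-separated files (made-up values).
red_csv = io.StringIO("fixed_acidity;pH;quality\n7.4;3.51;5\n7.8;3.20;6\n")
white_csv = io.StringIO("fixed_acidity;pH;quality\n7.0;3.00;6\n6.3;3.30;6\n")

# sep=";" tells pandas the fields are split on semicolons, not commas.
red_df = pd.read_csv(red_csv, sep=";")
white_df = pd.read_csv(white_csv, sep=";")

# Add a "color" column to each frame before combining them.
red_df["color"] = "red"
white_df["color"] = "white"

# Stack the rows of both frames into one dataset with a fresh index.
wine_df = pd.concat([red_df, white_df], ignore_index=True)

# The head should show red rows and the tail white rows.
print(wine_df.head(2))
print(wine_df.tail(2))
```

Looking at both ends of the frame is a quick sanity check that rows from both files made it in.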
I made a mistake!
- On the first try, I accidentally set the header to false, so the edited file had no header row for the columns. Yikes!
- Fixed it! That meant updating the edited file! I also had to rename a column before I could combine everything into the one file. I was silly and didn’t notice at first that the mismatched name had created an extra column of NaN values alongside the old one, so I had to delete that column, which brought me back to the 13 columns of data, properly named.
- So far, so good!
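Here is a small sketch of both mistakes, assuming pandas. The column names (`total-so2`, `total_so2`) and values are hypothetical, chosen only to show what happens when two frames spell a column differently and what `header=False` does on export.

```python
import pandas as pd

# Hypothetical frames whose second column is spelled differently.
red_df = pd.DataFrame({"pH": [3.51], "total-so2": [34.0], "color": ["red"]})
white_df = pd.DataFrame({"pH": [3.00], "total_so2": [170.0], "color": ["white"]})

# Mismatched names: concat keeps both spellings and fills the gaps with NaN.
bad = pd.concat([red_df, white_df], ignore_index=True)
print(bad.shape)  # one column more than either input

# Renaming first keeps the column count unchanged after combining.
red_df = red_df.rename(columns={"total-so2": "total_so2"})
good = pd.concat([red_df, white_df], ignore_index=True)
print(good.shape)

# header=False drops the column names from the output; the default keeps them.
no_header = good.to_csv(index=False, header=False)
with_header = good.to_csv(index=False)
print(with_header)
```

Comparing `shape` before and after a merge is a cheap way to catch a stray extra column early.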
Visualizations
- Learned how to do some simple diagrams / visualization using Seaborn
Common Functions
- This is using a different dataset that is associated with EPA data and carbon emissions.
- Checking for non-null values
- Checking for dupes
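A minimal sketch of those two checks, assuming pandas. The frame below is invented for illustration; its column names and values are not from the EPA data.

```python
import pandas as pd

# Made-up rows; "model" and "co2" are hypothetical column names.
df = pd.DataFrame({
    "model": ["A", "B", "B", "C"],
    "co2": [210.0, 195.0, 195.0, None],
})

# info() prints the non-null count and dtype of every column.
df.info()

# Number of missing values per column.
missing = df.isnull().sum()
print(missing)

# Number of rows that exactly repeat an earlier row.
dupes = df.duplicated().sum()
print(dupes)

# Drop the repeated rows, keeping the first occurrence of each.
deduped = df.drop_duplicates()
print(len(deduped))
```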
What I like so far
- The cells in Jupyter are great in that they allow you to focus on single, specified tasks rather than looking at pages of an intimidating code-base.
- It is also quite functional; you’re chaining functions together (e.g. .sum().mean()). Haven’t written a single self.ihatemylife yet :D
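To make the chaining idea concrete, here is a tiny sketch assuming pandas, with a made-up frame. Each call returns a new object, so the next method can be applied straight away.

```python
import pandas as pd

# Made-up numbers, just to show how the calls chain.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# sum() collapses each column to its total (a Series with one value per column).
col_totals = df.sum()

# Chaining mean() onto that result averages the column totals.
mean_of_totals = df.sum().mean()

# The same pattern counts every missing value in the whole frame.
total_missing = df.isnull().sum().sum()

print(col_totals)
print(mean_of_totals)
print(total_missing)
```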
To do
- Finish up chapter, which includes a lot of SQL and more Case Studies
- Finish up project 1, project 2
- Finish up Data Analysis coursework from other shorter course (project 3 and 4)
- Finish up application
- Prepare for interviews (I have two on 1/1)! bites nails
Katas
- Count consecutive pairs (tuples) in a list (Python)
e.g. [1,2,3,4,5] pairs up as [(1,2), (3,4)] with 5 left over; count = 2
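def pairs(ar):
    # Pair up neighbours: (ar[0], ar[1]), (ar[2], ar[3]), ...
    arr = zip(ar[0::2], ar[1::2])
    count = 0
    for i in arr:
        # A pair counts when its two values differ by exactly 1.
        if abs(i[0] - i[1]) == 1:
            count = count + 1
    return count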
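To see what the slicing and `zip` are doing, here is a compact variant of the same idea. It is an alternative sketch rather than the submitted kata solution, and the sample list is the one from the example above.

```python
def pairs(ar):
    # Even-position items zipped with odd-position items; a trailing
    # unpaired element is dropped because zip stops at the shorter input.
    paired = zip(ar[0::2], ar[1::2])
    # Count the pairs whose values differ by exactly 1.
    return sum(1 for a, b in paired if abs(a - b) == 1)


sample = [1, 2, 3, 4, 5]
print(list(zip(sample[0::2], sample[1::2])))  # [(1, 2), (3, 4)]
print(pairs(sample))  # 2
```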
- if number is multiple of index, return (JavaScript)
function multipleOfIndex(array) {
  var arr = [];
  // At i = 0, array[0] % 0 is NaN, so the first element is never included.
  for (var i = 0; i < array.length; i++) {
    if (array[i] % i === 0) {
      arr.push(array[i]);
    }
  }
  return arr;
}
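For comparison, the same kata could be sketched in Python like this. The function name is my own, and index 0 is skipped explicitly because Python raises ZeroDivisionError on `x % 0`, whereas the JavaScript version quietly gets NaN there.

```python
def multiple_of_index(array):
    # Keep each value that is a multiple of its own index.
    # Index 0 is skipped to avoid dividing by zero, mirroring the JS behaviour.
    return [x for i, x in enumerate(array) if i != 0 and x % i == 0]


print(multiple_of_index([22, -6, 32, 82, 9, 25]))  # [-6, 32, 25]
```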