©2019 Raazesh Sainudiin. Attribution 4.0 International (CC BY 4.0)
Fill in your Personal Number, make sure you pass the # ... Test cells, and submit by email from your official uu.se student email account to raazesh.sainudiin @ math.uu.se with Subject line YOIYUI001 Assignment 1.
You can submit multiple times before the deadline and your highest score will be used.
# Enter your 12 digit personal number here and evaluate this cell
MyPersonalNumber = 'YYYYMMDDXXXX'
#tests
assert(isinstance(MyPersonalNumber, str))
assert(MyPersonalNumber.isdigit())
assert(len(MyPersonalNumber)==12)
Given that you are being introduced to data science it is important to bear in mind the true costs of AI, a highly predictive family of algorithms used in data engineering sciences:
Answer whether each of the following statements is True or False according to the authors by appropriately replacing Xxxxx corresponding to TruthValueOfStatement0a, TruthValueOfStatement0b and TruthValueOfStatement0c, respectively, in the next cell to demonstrate your reading comprehension.
Statement0a = Each small moment of convenience (provided by Amazon's Echo) – be it answering a question, turning on a light, or playing a song – requires a vast planetary network, fueled by the extraction of non-renewable materials, labor, and data.
Statement0b = The Echo user is simultaneously a consumer, a resource, a worker, and a product.
Statement0c = Many of the assumptions about human life made by machine learning systems are narrow, normative and laden with error. Yet they are inscribing and building those assumptions into a new world, and will increasingly play a role in how opportunities, wealth, and knowledge are distributed.
# Replace Xxxxx with True or False; don't modify anything else in this cell!
TruthValueOfStatement0a = Xxxxx
TruthValueOfStatement0b = Xxxxx
TruthValueOfStatement0c = Xxxxx
Evaluate the cell below to make sure your answer is valid. Do not modify anything in that cell when running this local test of your solution. For the local test to work, you may sometimes need to include and evaluate code snippets from the lecture notebooks in cells above (see the error messages for clues). This is meant to help you become efficient at recalling material covered in lectures that relates to this problem. Such local tests will generally not be available in the exam.
# Test locally to ensure an acceptable answer, True or False
try:
    assert(isinstance(TruthValueOfStatement0a, bool))
    assert(isinstance(TruthValueOfStatement0b, bool))
    assert(isinstance(TruthValueOfStatement0c, bool))
except:
    print("Try again. You are not writing True or False for your answers.")
else:
    print("Good, you have answered either True or False. Hopefully they are the correct answers!")
You can double-click this cell and start writing your summary in English below, between the two --- lines. When you are done, just press CTRL-Enter (hold down the ctrl key and hit the Enter key) to see how it looks in display mode.
Evaluate the following two cells by replacing X with the right command-line option to the wc command in order to find the number of lines in data/earthquakes_small.csv and the number of characters in data/earthquakes_small.csv.
Finally, update the following cell by replacing XXX with the right integer answers, respectively, for NumberOfLinesIn_earthquakes_small_csv_file and NumberOfCharactersIn_earthquakes_small_csv_file.
Here is a brief synopsis of wc that you would get from running man wc as follows:
%%sh
man wc
WC(1) BSD General Commands Manual WC(1)
NAME
wc -- word, line, character, and byte count
SYNOPSIS
wc [-clmw] [file ...]
DESCRIPTION
The wc utility displays the number of lines, words, and bytes contained in each input file, or standard input (if no file is specified) to the standard output. A line is defined as a string of characters delimited by a <newline> character. Characters beyond the final <newline> character will not be included in the line count.
A word is defined as a string of characters delimited by white space characters. White space characters are the set of characters for which the iswspace(3) function returns true. If more than one input file is specified, a line of cumulative counts for all the files is displayed on a separate line after the output for the last file.
The following options are available:
-c The number of bytes in each input file is written to the standard output. This will cancel out any prior usage of the -m option.
-l The number of lines in each input file is written to the standard output.
-m The number of characters in each input file is written to the standard output. If the current locale does not support multibyte characters, this is equivalent to the -c option. This will cancel out any prior usage of the -c option.
-w The number of words in each input file is written to the standard output.
When an option is specified, wc only reports the information requested by that option. The order of output always takes the form of line, word, byte, and file name. The default action is equivalent to specifying the -c, -l and -w options.
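The counting rules in the synopsis above can be mimicked in a few lines of Python. The sketch below is only an illustration, assuming Python 3 and UTF-8 encoded text; the helper `count_like_wc` and the sample file contents are hypothetical, and `str.split` only approximates the iswspace(3) notion of white space used by wc.

```python
import os
import tempfile


def count_like_wc(path):
    """Return (lines, words, bytes, characters) for the file at `path`.

    Assumption: the file is UTF-8 encoded text.
    """
    with open(path, "rb") as f:
        raw = f.read()
    text = raw.decode("utf-8")
    n_lines = raw.count(b"\n")   # a line is delimited by a <newline> character
    n_words = len(text.split())  # strings delimited by white space
    n_bytes = len(raw)           # raw size on disk
    n_chars = len(text)          # can be smaller than n_bytes for multibyte text
    return n_lines, n_words, n_bytes, n_chars


# Hypothetical sample: two complete lines, a trailing fragment without a
# final newline, and one two-byte character (the letter Å).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write("time,place\n2019,Åre\nno newline".encode("utf-8"))
    sample_path = tmp.name

counts = count_like_wc(sample_path)
os.remove(sample_path)
print(counts)  # (2, 4, 31, 30)
```

The trailing fragment "no newline" adds to the word, byte and character counts but not to the line count, and the single multibyte character makes the byte count one larger than the character count.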
%%sh
# replace X in the next line with the right option to find the number of lines
wc -X data/earthquakes_small.csv
%%sh
# replace X in the next line with the right option to find the number of characters
wc -X data/earthquakes_small.csv
# Write your answers below by replacing XXX; don't modify anything else!
NumberOfLinesIn_earthquakes_small_csv_file = XXX
NumberOfCharactersIn_earthquakes_small_csv_file = XXX
Evaluate the cell below to make sure your answer is valid. Do not modify anything in that cell when running this local test of your solution. For the local test to work, you may sometimes need to include and evaluate code snippets from the lecture notebooks in cells above (see the error messages for clues). This is meant to help you become efficient at recalling material covered in lectures that relates to this problem. Such local tests will generally not be available in the exam.
# Evaluate this cell locally to make sure you have the answer as a non-negative integer
try:
    assert(NumberOfLinesIn_earthquakes_small_csv_file > -1)
    print("Good! You have 0 or more lines as your answer. Hopefully it is correct!")
except AssertionError:
    print("Try again. You do not seem to have a valid number of lines as your answer.")
try:
    assert(NumberOfCharactersIn_earthquakes_small_csv_file > -1)
    print("Good! You have 0 or more characters as your answer. Hopefully it is correct!")
except AssertionError:
    print("Try again. You do not seem to have a valid number of characters as your answer.")
Consider the experiment where we roll two fair dice independently.
Let $D$ be the event that "the sum of the two dice is 8" and let $C$ be the event that "the first die is 2".
What is the probability of D given C, i.e. what is $P(D|C)$?
Do the calculation by hand and write the answer in the next cell by assigning it to the variable ProbOfDGivenC.
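To check a hand calculation of this kind, one option is to enumerate the 36 equally likely outcomes and apply $P(A|B) = P(A \cap B)/P(B)$ directly. The sketch below is a minimal illustration on two different, hypothetical events (A: "the sum is at least 10", B: "the first die is 6"), so the problem itself is still left to do by hand; the names `outcomes`, `prob` and `cond_prob` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die), each equally likely.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of `event`, a predicate on an outcome (d1, d2)."""
    favourable = [w for w in outcomes if event(w)]
    return Fraction(len(favourable), len(outcomes))


def cond_prob(event_a, event_b):
    """P(A | B) = P(A and B) / P(B); assumes P(B) > 0."""
    return prob(lambda w: event_a(w) and event_b(w)) / prob(event_b)


def sum_at_least_10(w):
    return w[0] + w[1] >= 10


def first_is_6(w):
    return w[0] == 6


example = cond_prob(sum_at_least_10, first_is_6)
print(example)  # 1/2
```

Using `Fraction` keeps the result exact, which makes it easy to compare against a fraction obtained by hand.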
# Replace XXX below with the correct answer to Assignment 1 Problem 3
# Do NOT change the name of the variable ProbOfDGivenC
ProbOfDGivenC = XXX
Evaluate the cell below to make sure your answer is valid. Do not modify anything in that cell when running this local test of your solution. For the local test to work, you may sometimes need to include and evaluate code snippets from the lecture notebooks in cells above (see the error messages for clues). This is meant to help you become efficient at recalling material covered in lectures that relates to this problem. Such local tests will generally not be available in the exam.
# test that your answer is indeed a probability by evaluating this cell after you replaced XXX above and evaluated it.
try:
    assert(ProbOfDGivenC >= 0 and ProbOfDGivenC <= 1)
    print("Your answer is a probability, hopefully it is correct.")
except AssertionError:
    print("Try again! and make sure you are actually producing a valid probability, i.e., a real number in [0,1]")
Recall that for a given parameter $\theta \in [0,1]$, the probability mass function (PMF) for the $Bernoulli(\theta)$ RV $X$ is:
$$ f(x;\theta) = \theta^x (1-\theta)^{1-x} \mathbf{1}_{\{0,1\}}(x) = \begin{cases} \theta & \text{if $x=1$,}\\ 1-\theta & \text{if $x=0$,}\\ 0 & \text{otherwise} \end{cases} $$
In the next cell, write a function named pmfOfBernoulli that takes two arguments, x and theta, and returns the value of $f(x; \theta)$.
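Any PMF of this form takes values in $[0,1]$, sums to 1 over its support, and is 0 off the support. The sketch below shows one way such properties could be checked numerically. It is only an illustration: `is_valid_pmf` is a hypothetical helper, and `fair_coin_pmf` is a hard-coded stand-in with $\theta$ fixed at $1/2$, not a solution to this problem.

```python
from fractions import Fraction


def is_valid_pmf(pmf, support, off_support_points, tol=1e-12):
    """Check three basic PMF properties on the given points.

    pmf: function of one argument x (fix theta beforehand if needed).
    support: points where the PMF may be positive.
    off_support_points: sample points where the PMF should be 0.
    """
    values = [pmf(x) for x in support]
    in_unit_interval = all(0 <= v <= 1 for v in values)
    sums_to_one = abs(sum(values) - 1) < tol
    zero_elsewhere = all(pmf(x) == 0 for x in off_support_points)
    return in_unit_interval and sums_to_one and zero_elsewhere


def fair_coin_pmf(x):
    # Hypothetical stand-in: a lookup table with theta fixed at 1/2.
    return {0: Fraction(1, 2), 1: Fraction(1, 2)}.get(x, 0)


check = is_valid_pmf(fair_coin_pmf, support=[0, 1], off_support_points=[-1, 2, 0.5])
print(check)  # True
```

Once your own function is written, the same check could be applied with a fixed parameter, for example by passing `lambda x: pmfOfBernoulli(x, 0.3)` as the first argument.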
# Replace RRR...RRR below. Do NOT change the name of the function `pmfOfBernoulli`!
def pmfOfBernoulli(x, theta):
    '''RRR ... RRR'''
    RRR
    RRR
    RRR
    ...
    RRR
Evaluate the cell below to make sure your answer is valid. Do not modify anything in that cell when running this local test of your solution. For the local test to work, you may sometimes need to include and evaluate code snippets from the lecture notebooks in cells above (see the error messages for clues). This is meant to help you become efficient at recalling material covered in lectures that relates to this problem. Such local tests will generally not be available in the exam.
# Evaluate this to locally test that your solution is returning probabilities
try:
    assert (pmfOfBernoulli(0, 1/2) >= 0) and (pmfOfBernoulli(1, 1/2) <= 1)
    print("You seem to have a valid probability for your answer. Hopefully it is correct!")
except:
    print("Try again. You don't have a valid probability,\n"
          "i.e., a real number in the unit interval [0,1] for your answer")