\documentclass[fleqn]{article}
\usepackage{mydefs}
\usepackage{notes}
\usepackage{url}
\begin{document}
\lecture{Machine Learning}{HW04: Feature and model selection}{CS 689, Spring 2015}
% IF YOU ARE USING THIS .TEX FILE AS A TEMPLATE, PLEASE REPLACE
% "CS 689, Spring 2015" WITH YOUR NAME AND UID.
Hand in via moodle at: \url{https://moodle.umass.edu/course/view.php?id=20836}.
Remember that only PDF submissions are accepted. We encourage using
\LaTeX\ to produce your writeups. See \verb+hw00.tex+ for an example
of how to do so. You can make a \verb+.pdf+ out of the \verb+.tex+ by
running ``\verb+pdflatex hw00.tex+''. You'll need \verb+mydefs.sty+ and \verb+notes.sty+, which can be downloaded from the course page.
\bee
\i For each of \{ centering, variance scaling \} and each of \{ decision trees, kNN, perceptron \}, state
whether the given preprocessing step affects the classifier's predictions or not.
% \begin{solution}
%\end{solution}
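(A sanity-check device, not a substitute for the reasoning the question asks for: one can compare a classifier's predictions before and after a preprocessing step. The pure-Python sketch below does this for 1-NN with centering; the dataset and sizes are synthetic, made up purely for illustration.)

```python
import random

# Empirical check: do 1-NN predictions change if we center the features?
# (Synthetic data; this illustrates the *method* of checking, not the answer.)

def one_nn_predict(train_X, train_y, x):
    """Predict the label of x as the label of its nearest training point."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]

random.seed(0)
train_X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(20)]
train_y = [random.choice([0, 1]) for _ in range(20)]
test_X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]

# Center both train and test using the training-set feature means.
means = [sum(col) / len(col) for col in zip(*train_X)]
train_Xc = [[a - m for a, m in zip(row, means)] for row in train_X]
test_Xc = [[a - m for a, m in zip(row, means)] for row in test_X]

before = [one_nn_predict(train_X, train_y, x) for x in test_X]
after = [one_nn_predict(train_Xc, train_y, x) for x in test_Xc]
print(before == after)  # squared distances are translation-invariant
```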
\i Assume you have $D$ features, each generated from a zero-mean, unit-variance Gaussian distribution. That is, let $\mathbf{u}$ and $\mathbf{v}$ be two such vectors, with $u_i \sim N(0, 1)$ and $v_i \sim N(0,1)$. Show that
\[
\left[\frac{||\mathbf{u}-\mathbf{v}||^2}{D}\right] \rightarrow 2 \text{, as } D \rightarrow \infty.
\]
Here $\rightarrow 2$ means that the value is tightly concentrated around $2$ for large $D$.
(Hint: use the law of large numbers\footnote{\url{http://en.wikipedia.org/wiki/Law_of_large_numbers}}.)
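(A hedged numerical check of the claim, with synthetic draws only: each term $(u_i - v_i)^2$ has mean $\mathrm{Var}(u_i - v_i) = 2$, so the average of $D$ such i.i.d. terms should hover near $2$.)

```python
import random

# Monte Carlo check: draw u, v with i.i.d. N(0,1) entries and compute
# ||u - v||^2 / D. Each (u_i - v_i)^2 has expectation 2, so by the law
# of large numbers the average concentrates around 2 as D grows.

random.seed(0)
D = 100_000
u = [random.gauss(0, 1) for _ in range(D)]
v = [random.gauss(0, 1) for _ in range(D)]
avg = sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / D
print(avg)  # close to 2 for large D
```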
\i For a perceptron you might obtain a confidence value for a prediction by looking at how far the point is from the decision boundary. How might you obtain confidence values from a decision tree and from a kNN classifier?
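(A minimal sketch of the perceptron confidence the question already describes, i.e.\ the unsigned distance from a point to the hyperplane $\mathbf{w}\cdot\mathbf{x} + b = 0$; the weights and point are made-up numbers for illustration, and the decision-tree and kNN parts are left to you.)

```python
import math

def perceptron_confidence(w, b, x):
    """Distance of x from the boundary w.x + b = 0; larger = more confident."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

# Illustrative numbers only: w = [3, 4], b = -5, x = (2, 2).
w, b = [3.0, 4.0], -5.0
print(perceptron_confidence(w, b, [2.0, 2.0]))  # |3*2 + 4*2 - 5| / 5 = 1.8
```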
\i Give one reason why 10-fold cross-validation might be preferable to leave-one-out cross-validation.
\ene
\end{document}