Machine Learning Note, P4

Underfitting (high bias) and overfitting (high variance); overfitting is often caused by having too many features. There are two options for addressing overfitting: reduce the number of features, or use regularization. The regularized cost function for linear regression adds a penalty term, and care is needed when choosing lambda (the regularization parameter): if it is too large, it leads to underfitting. Gradient descent on the regularized cost function works for both linear regression and logistic regression. A normal equation with regularization is also available, but I didn't get it. For logistic regression, regularization also applies to gradient descent and to the advanced optimization methods. Andrew smiled: as you've learned linear/logistic regression, gradient descent, advanced optimization and regularization, you probably know more than quite a…
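A small NumPy sketch of a regularized linear-regression cost and one gradient-descent step, under my own assumptions (the function names are mine; skipping the penalty on theta[0] follows the usual convention, and a very large lam illustrates the underfitting point above):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """J(theta) = (1/2m) * [sum((X@theta - y)^2) + lam * sum(theta[1:]^2)].
    The intercept theta[0] is not penalized; an overly large lam shrinks the
    remaining parameters toward zero and leads to underfitting."""
    m = len(y)
    residual = X @ theta - y
    return (residual @ residual + lam * np.sum(theta[1:] ** 2)) / (2 * m)

def regularized_gd_step(theta, X, y, alpha, lam):
    """One gradient-descent update on the regularized cost."""
    m = len(y)
    grad = (X.T @ (X @ theta - y)) / m
    grad[1:] += (lam / m) * theta[1:]   # regularization term, skipping theta[0]
    return theta - alpha * grad
```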

Machine Learning Note, P3

Logistic regression is a CLASSIFICATION algorithm. The hypothesis of logistic regression is h_theta(x) = g(theta^T x), where g(z) is called the sigmoid function. Interpretation of the hypothesis: h_theta(x) is the estimated probability that y = 1 on input x, i.e. h_theta(x) = P(y=1 | x; theta). So we predict y=1 if h_theta(x) >= 0.5, which is when theta^T x >= 0, and predict y=0 if h_theta(x) < 0.5, which is when theta^T x < 0. The decision boundary is the boundary where theta^T x == 0, on either side of which the hypothesis predicts differently. It is a property of the hypothesis and its parameters, not of the training set. Logistic regression uses a separate cost function for the y=1 and y=0 cases; the cost goes to…
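A minimal NumPy sketch of the pieces above (my own illustration, not course code; the function names and the vectorized form of the cost are my assumptions):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x) = estimated P(y=1 | x; theta)."""
    return sigmoid(X @ theta)

def predict(theta, X):
    """Predict y=1 when h_theta(x) >= 0.5, i.e. exactly when theta^T x >= 0."""
    return (X @ theta >= 0).astype(int)

def cost(theta, X, y):
    """Logistic cost: -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y)
    h = hypothesis(theta, X)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
```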

Why I Lean Toward Doing a PhD

I've been thinking about this a lot recently. From the doubt and hesitation at the beginning of the year, when I couldn't give myself a clear answer, I've now finally gathered enough evidence to convince myself that doing a PhD is a great thing. Although I still can't commit to the decision that I must do a PhD, my inclination is already clear: I want to do a PhD, ideally in a CS program ranked in the global top 20. My reasons are as follows.

Many of the people I respect and consider brilliant (even great) hold doctoral degrees, for example Mr. Einstein and Mr. Kai-Fu Lee, and people around me such as Yin Wei and my colleagues Antti & Andrea. And in modern science, a good number of the people who truly push science forward hold PhDs (for example Dr. Yuandong Tian, whom I learned about recently). This reminds me of a viewpoint from Zhihu: when you are not sure what to do, look at what people better than you are doing. There is also the picture of the circle, with one point on its edge pushed outward just a little bit (something else I learned on Zhihu); the figure is below (taken from Matt Might's blog, which I highly recommend). Putting all of your strength into pushing the boundary of human knowledge forward just a little bit is, in my view, a very cool thing.

Learning is the most effective investment, and the best way to break through class boundaries. I learned this from two people I admire: one is a classmate of mine who was recommended for graduate study at Tsinghua (see here), and the other is Mr. Li Xiaolai, the author of "Treat Time as a Friend", who makes the same point in his book. In fact, when parents urge their children to "study hard", this is the viewpoint hidden behind it. A question I never understood as a child: "What is studying hard ultimately for?" "Do well in primary school to get into a good middle school, do well in middle school to get into a good high school, do well in high school to get into a good university, do well in university to find a good job, and then what, after you've found the good job?" "He who excels in study becomes an official": study well and you might get an official post, bringing glory to the whole family, with everyone rising along with you; from commoner to court robes, a leap across classes. Put more superficially: a higher degree means a better job and more money (resources), improving your own (even your family's) life. Of course these are all surface phenomena, the dreams of many hard-working parents, and the furthest answer most of our parents can give. But the better answer hidden behind them is that learning is the best way to improve yourself and to change the class you belong to. Doing a PhD would guarantee more learning time in my life, and provide more possibilities to break through the boundaries of class and of myself.

I am not satisfied with the platform I am on now. The limits of my personal horizons and the level of my ability are basically on par with my circle of friends and the platform I am on. If I stop pursuing degrees now, then even if I work hard at my job and meet stronger technical experts, the most capable and most successful people I know will most likely still be the excellent classmates I met as an undergraduate. Likewise, if I stop after finishing a master's degree at UESTC, the most capable and most successful people I know will most likely be the excellent classmates I met during my master's at UESTC. If I can get into a PhD program as I dream of, then the most capable and most successful people I know should be the classmates, colleagues, or teachers I meet while pursuing the PhD. The platform determines the horizon; I am not satisfied with my current platform, I hope to have a broader horizon, and a good PhD can bring me a better platform.

I am already the most educated person in my family, and I want to keep that record going. Haha, yes: I am already the most educated person on both my father's and my mother's side of the family, and I rather enjoy it. I want to keep raising and holding this record, and make it harder to break. I hope that when my children are 20 I can say, a bit smugly, "Talk to me when you're a Doctor", or "Talk to me when your degree is higher than mine." xD

Written on the evening of 2016-08-10. …

Machine Learning Note, P2

For the normal equation, feature scaling is not necessary. The pros and cons of gradient descent vs. the normal equation: gradient descent also works for other models such as classification, while the normal equation only works for linear regression. Causes of X^T X being non-invertible: redundant (linearly dependent) features, or having fewer training examples than features. I still have the normal equation part of project ex1 left to do. …
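A quick sketch of the normal equation under my own assumptions (this is not the ex1 solution; the function name and the use of `pinv` to cope with a non-invertible X^T X are my choices):

```python
import numpy as np

def normal_equation(X, y):
    """Closed-form linear regression: theta = (X^T X)^{-1} X^T y.
    No feature scaling or learning rate needed; pinv still returns a usable
    answer when X^T X is non-invertible (redundant features, or fewer
    training examples than features)."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Toy usage: a column of ones for the intercept plus one feature.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])
theta = normal_equation(X, y)   # approximately [1.0, 1.0], i.e. y = 1 + x
```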

Machine Learning Note, P1

Linear regression: hypothesis, cost function, gradient descent. For linear regression with multiple features, feature scaling helps gradient descent converge faster: get every feature into approximately a -1 <= x(i) <= 1 range, or use mean normalization, (x[i] - x_mean) / (x_max - x_min). To make sure gradient descent is working, run it for 100 or so iterations and plot J(theta); J(theta) should decrease after every iteration. Alternatively, declare convergence if J(theta) decreases by less than some very small value in one iteration. With a sufficiently small alpha, J(theta) should decrease on every iteration, but if alpha is too small, convergence will be slow…
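Below is a minimal NumPy sketch of these steps (my own illustration, not code from the course; the function names, the tolerance `tol`, and the toy data are my assumptions):

```python
import numpy as np

def mean_normalize(X):
    """Scale each feature to roughly [-1, 1] via (x - mean) / (max - min)."""
    return (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

def cost(X, y, theta):
    """Squared-error cost J(theta) = (1/2m) * sum((X@theta - y)^2)."""
    m = len(y)
    residual = X @ theta - y
    return residual @ residual / (2 * m)

def gradient_descent(X, y, alpha=0.1, iterations=100, tol=1e-6):
    """Batch gradient descent; declares convergence when J(theta) decreases
    by less than tol in one iteration."""
    m, n = X.shape
    theta = np.zeros(n)
    history = [cost(X, y, theta)]
    for _ in range(iterations):
        theta -= (alpha / m) * (X.T @ (X @ theta - y))
        history.append(cost(X, y, theta))
        if history[-2] - history[-1] < tol:
            break
    return theta, history   # plot history to check J decreases every iteration

# Toy usage: two features, plus a column of ones for the intercept term.
X_raw = np.array([[2104.0, 3], [1600.0, 3], [2400.0, 4], [1416.0, 2]])
y = np.array([400.0, 330.0, 369.0, 232.0])
X = np.hstack([np.ones((len(y), 1)), mean_normalize(X_raw)])
theta, history = gradient_descent(X, y, alpha=0.1, iterations=400)
```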