# machine-learning

**Repository Path**: guanwl/machine-learning

## Basic Information

- **Project Name**: machine-learning
- **Description**: Machine Learning for the Social Scientist
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-12-05
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Machine Learning for the Social Scientist

Wang Chengjun (王成军)

**Machine Learning for the Social Scientist (社会科学家的机器学习)**: Workshop Overview

This workshop is aimed at social science researchers and uses Python to introduce the basic logic of machine learning (and deep learning). Participants need to install Anaconda in advance. The workshop has three (or four) parts:

1. Introduction to machine learning: starting from the Titanic
2. Machine learning basics: naive Bayes and linear regression
3. Intermediate machine learning: support vector machines and random forests
4. Machine learning extensions: neural network models with PyTorch (optional)

The slides, Python code, readings, and case studies used in this workshop are available at https://github.com/computational-class/machine-learning

> Note: This part introduces the basic logic and algorithms of machine learning using Python. Participants should install Anaconda and PyTorch in advance and be familiar with Jupyter Notebook.

| No. | Date | Time and venue | Topic | Class hours |
| -------------|:-------------:|:-------------:|:-------------:|-----:|
| 1 | June 19 (Wed) | 9:00-12:00, Multi-function Hall 319, Xinjian Building (新建楼) | **Introduction to machine learning**: starting from the Titanic | 3 |
| 2 | June 20 (Thu) | 9:00-12:00, Multi-function Hall 319, Xinjian Building (新建楼) | **Machine learning basics**: naive Bayes and linear regression | 3 |
| 3 | June 20 (Thu) | 13:30-16:30, Multi-function Hall 319, Xinjian Building (新建楼) | **Intermediate machine learning**: support vector machines and random forests | 3 |
| 4 | June 21 (Fri) | 9:00-12:00, Multi-function Hall 319, Xinjian Building (新建楼) | **Machine learning extensions**: neural network models with PyTorch | 3 |

https://github.com/computational-class/machine-learning

### Slides

- [What Is Machine Learning?](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.01-What-Is-Machine-Learning.ipynb#/)
- [Introducing Scikit-Learn](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.02-machine-learning-with-sklearn.ipynb#/)
- [Hyperparameters and Model Validation](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.03-Hyperparameters-and-Model-Validation.ipynb#/)
- [Feature
Engineering](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.04-Feature-Engineering.ipynb#/)
- [In Depth: Naive Bayes Classification](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.05-Naive-Bayes.ipynb#/)
- [In Depth: Linear Regression](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.06-Linear-Regression.ipynb#/)
- [In-Depth: Support Vector Machines](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.07-Support-Vector-Machines.ipynb#/)
- [In-Depth: Decision Trees and Random Forests](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.08-Random-Forests.ipynb#/)
- [In Depth: Neural Networks 1](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.09.neural_network_1.ipynb#/)
- [In Depth: Neural Networks 2](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.09.neural_network_2.ipynb#/)
- [In Depth: Neural Networks 3](https://nbviewer.jupyter.org/format/slides/github/computational-class/machine-learning/blob/master/09.09.neural_network_pytorch.ipynb#/)

## Case Studies

- House price prediction: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/
- Predicting whether bank customers will subscribe to a term deposit: http://www.dcjingsai.com/common/cmpt/ANZ%20Chengdu%20Data%20Science%20Competition_%E7%AB%9E%E8%B5%9B%E4%BF%A1%E6%81%AF.html?lang=en_US
- Predicting game players' spending: http://www.dcjingsai.com/common/cmpt/%E6%B8%B8%E6%88%8F%E7%8E%A9%E5%AE%B6%E4%BB%98%E8%B4%B9%E9%87%91%E9%A2%9D%E9%A2%84%E6%B5%8B%E5%A4%A7%E8%B5%9B_%E7%AB%9E%E8%B5%9B%E4%BF%A1%E6%81%AF.html
- Predicting fake news: https://www.kaggle.com/c/fake-news

## Recommended Textbooks

- **Whirlwind Tour Of Python** https://jakevdp.github.io/WhirlwindTourOfPython/
- **Python Data Science Handbook** https://jakevdp.github.io/PythonDataScienceHandbook/

## Reference Books

- **Python
for Data Analysis** by Wes McKinney, published by O'Reilly Media https://github.com/data-science-lab/pydata-book
- **Introduction to Machine Learning with Python: A Guide for Data Scientists** https://github.com/amueller/introduction_to_ml_with_python
- **Machine Learning in Action** https://github.com/pbharrin/machinelearninginaction & https://github.com/RedstoneWill/MachineLearningInAction-Camp & https://github.com/TingNie/Machine-learning-in-action
- Zhou Zhihua (周志华). *Machine Learning* (《机器学习》). Beijing: Tsinghua University Press, 2016. (ISBN 978-7-302-42328-7)
- Easley, David and [Jon Kleinberg](http://www.cs.cornell.edu/home/kleinber/). 2011. [Networks, Crowds, and Markets: Reasoning About a Highly Connected World](http://www.cs.cornell.edu/home/kleinber/networks-book/). New York: Cambridge University Press. Chinese edition: 大卫・伊斯利 and 乔恩・克莱因伯格. (2011). [网络、群体与市场](https://www.baidu.com/s?wd=%E7%BD%91%E7%BB%9C%E3%80%81%E7%BE%A4%E4%BD%93%E4%B8%8E%E5%B8%82%E5%9C%BA): 揭示高度互联世界的行为原理与效应机制. Tsinghua University Press.
- Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. 2011. *Mining of Massive Datasets* (2nd ed.). http://www.mmds.org/

## Related Courses

- Nanjing University course "Big Data Mining and Analysis" (《大数据挖掘与分析》) https://github.com/computational-class/bigdata
- 用Python玩转数据 (Playing with Data in Python), China University MOOC (中国大学MOOC) http://www.icourse163.org/course/nju-1001571005
- Advanced Machine Learning with scikit-learn, Andreas Müller http://bit.ly/advanced_machine_learning_scikit-learn & https://github.com/computational-class/PythonMachineLearning
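
## Environment Check

The workshop note asks participants to install Anaconda and PyTorch beforehand. The following is a minimal sketch for checking that setup before the first session. The package list (`numpy`, `pandas`, `sklearn`, `torch`) and the helper name `check_packages` are assumptions made for illustration; they are not taken from the course materials, so adjust the list to whatever the notebooks actually import.

```python
import importlib.util
import sys

# Top-level import names to look for.
# Assumption: this list is illustrative, not an official requirements list.
REQUIRED = ["numpy", "pandas", "sklearn", "torch"]


def check_packages(names):
    """Map each top-level module name to True if Python can locate it."""
    # find_spec returns None when a top-level module cannot be found,
    # and it does not import the module itself.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_packages(REQUIRED).items():
        print(f"{name:10s} {'ok' if found else 'MISSING'}")
```

Running the script in the Anaconda environment used for Jupyter prints one line per package. A `MISSING` entry suggests installing that package with `conda` or `pip` before opening the notebooks.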