ML Links
Created Friday 29 January 2021
https://www.datacamp.com/community/tutorials/customer-life-time-value
https://towardsdatascience.com/better-heatmaps-and-correlation-matrix-plots-in-python-41445d0f2bec
Pipelining & Transforming
Linear Regression
- A comprehensive beginners guide for Linear, Ridge and Lasso Regression in Python and R - https://www.analyticsvidhya.com/blog/2017/06/a-comprehensive-guide-for-linear-ridge-and-lasso-regression/?
- Step by Step Assumptions - Linear Regression https://bigdata-madesimple.com/how-to-run-linear-regression-in-python-scikit-learn/ - GOOD Do this
- How do you check the quality of your regression model in Python? - https://towardsdatascience.com/how-do-you-check-the-quality-of-your-regression-model-in-python-fa61759ff685
- The Five Linear Regression Assumptions: Testing on the Kaggle Housing Price Dataset - https://boostedml.com/2018/08/testing-linear-regression-assumptions-the-kaggle-housing-price-dataset.html
- Regression Analysis with Assumptions, Plots & Solutions - https://www.analyticsvidhya.com/blog/2016/07/deeper-regression-analysis-assumptions-plots-solutions/
- Step by Step Assumptions - Linear Regression - https://www.kaggle.com/shrutimechlearn/step-by-step-assumptions-linear-regression
- https://towardsdatascience.com/linear-regression-using-python-b136c91bf0a2
- https://towardsdatascience.com/linear-regression-on-boston-housing-dataset-f409b7e4a155
https://www.statisticshowto.com/goodness-of-fit-test/
https://medium.com/towards-artificial-intelligence/marketing-analytics-part-1-3cf5891b8dbd
Linearity
- sns.pairplot
- heatmap - https://www.statsmodels.org/stable/generated/statsmodels.graphics.correlation.plot_corr.html#statsmodels.graphics.correlation.plot_corr
- Residuals vs. predicting variables plots
- Histogram and Q-Q plot of normalized residuals - https://statisticsbyjim.com/regression/model-specification-variable-selection/
- Multicollinearity in Regression Analysis: Problems, Detection, and Solutions https://statisticsbyjim.com/regression/multicollinearity-in-regression-analysis/
- What is Multicollinearity? Here’s Everything You Need to Know - https://www.analyticsvidhya.com/blog/2020/03/what-is-multicollinearity/
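One way to put a number on the multicollinearity check is the variance inflation factor (VIF): regress each predictor on all the others and compute 1 / (1 - R²). The sketch below is a minimal NumPy version on made-up data. The helper name `vif` and the synthetic columns `a`, `b`, `c` are illustrative assumptions, not taken from the linked articles. A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign, but that cutoff is a convention, not a hard rule.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic data (assumption): c is almost a copy of a, b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)

vifs = vif(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], vifs.round(1))))
```

Here `a` and `c` should show large VIFs while `b` stays close to 1, which is the pattern to look for before dropping or combining predictors.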
How to Transform Data to Better Fit The Normal Distribution https://machinelearningmastery.com/how-to-transform-data-to-fit-the-normal-distribution/
- Polynomial Regression - https://towardsdatascience.com/polynomial-regression-bbe8b9d97491
https://towardsdatascience.com/simulating-replicating-r-regression-plot-in-python-using-sklearn-4ee48a15b67 - Q-Q plot, residual plot, homoscedasticity plot
OLS
https://statisticsbyjim.com/regression/predictions-regression/
https://statisticsbyjim.com/regression/check-residual-plots-regression-analysis/
https://www.statsmodels.org/devel/examples/notebooks/generated/ols.html
Regression Feature Importance
Recursive Feature Elimination https://machinelearningmastery.com/rfe-feature-selection-in-python/
Use residual plots to check the assumptions of an OLS linear regression model. If you violate the assumptions, you risk producing results that you can’t trust. Residual plots display the residual values on the y-axis and fitted values, or another variable, on the x-axis. After you fit a regression model, it is crucial to check the residual plots. If your plots display unwanted patterns, you can’t trust the regression coefficients and other numeric results.
Residuals should be scattered randomly around zero. If they are not, some of the explanatory information has leaked into the supposedly random error. There are several possible reasons for this, including a missing:
- Independent variable.
- Polynomial term to model a curve.
- Interaction term.
To fix the problem, you need to identify the missing information, variable, or higher-order term and include it in the model. After you correct the problem and refit the model, the residuals should look random.
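The missing-term case can be sketched in a few lines of NumPy. The data below are synthetic (an assumption for illustration): the true relationship has a squared term, the first model leaves it out, and the second includes it. Instead of eyeballing a plot, the sketch measures how strongly the residuals track `x**2`. The helper name `fit_residuals` is hypothetical.

```python
import numpy as np

# Synthetic data (assumption): y depends on x and x**2, plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)


def fit_residuals(design, target):
    """Ordinary least-squares fit; returns fitted values and residuals."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    fitted = design @ coef
    return fitted, target - fitted


# Model 1: intercept and x only (the polynomial term is missing).
X_line = np.column_stack([np.ones_like(x), x])
_, resid_line = fit_residuals(X_line, y)

# Model 2: intercept, x, and x**2 (the missing term is added).
X_quad = np.column_stack([np.ones_like(x), x, x**2])
_, resid_quad = fit_residuals(X_quad, y)

# If the residuals still correlate with x**2, information has leaked into them.
pattern_line = abs(np.corrcoef(resid_line, x**2)[0, 1])
pattern_quad = abs(np.corrcoef(resid_quad, x**2)[0, 1])
print(f"|corr(resid, x^2)| without squared term: {pattern_line:.3f}")
print(f"|corr(resid, x^2)| with squared term:    {pattern_quad:.3f}")
```

Plotting `resid_line` against the fitted values would show the familiar U shape. After the refit, `resid_quad` has no remaining linear relationship with `x**2` and a smaller spread.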
Scenarios
Handling Zero R2
https://www.researchgate.net/post/Why_is_the_Pred_R-squared_equal_to_zero
Visualization
- An Intuitive Guide to Data Visualization in Python https://www.analyticsvidhya.com/blog/2021/02/an-intuitive-guide-to-visualization-in-python/
Pipeline, Cross Validation and Hyperparameter Tuning
- Comparing different sklearn classifiers - https://www.kaggle.com/meetnaren/comparing-different-sklearn-classifiers
- Comparing Regression Models -
Bagging Classifier
Decision Tree
Standardization & Normalization
https://stats.stackexchange.com/questions/77850/assign-weights-to-variables-in-cluster-analysis
https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/
https://machinelearningmastery.com/how-to-improve-neural-network-stability-and-modeling-performance-with-data-scaling/
https://www.kdnuggets.com/2019/04/normalization-vs-standardization-quantitative-analysis.html
https://sebastianraschka.com/Articles/2014_about_feature_scaling.html
https://datascience.stackexchange.com/questions/45900/when-to-use-standard-scaler-and-when-normalizer
Scaling and Normalization
http://www.faqs.org/faqs/ai-faq/neural-nets/part2/section-16.html
KMeans
https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html
https://stackabuse.com/k-means-clustering-with-scikit-learn/
Seaborn Plots
https://dev.to/thalesbruno/subplotting-with-matplotlib-and-seaborn-5ei8
https://seaborn.pydata.org/examples/grouped_barplot.html
Pandas Groupby - https://realpython.com/pandas-groupby/