WORKING FILES



Apache Spark Training


Step 1: Local Laptop Cluster Setup

Walk the trainees through the local laptop cluster setup, then ask them to repeat it on another laptop.

scala-2.12.4

jdk-8u202-windows-x64

Anaconda3-2022.10-Windows-x86_64

https://www.apache.org/dyn/closer.lua/spark/spark-3.3.1/spark-3.3.1-bin-hadoop2.tgz

https://github.com/steveloughran/winutils/archive/refs/heads/master.zip

Environment Variables (Windows)

HADOOP_HOME = C:\SPARK\hadoop
JAVA_HOME = C:\Program Files\Java\jdk1.8.0_202
SCALA_HOME = C:\SPARK\scala
SPARK_HOME = C:\SPARK\spark
PYSPARK_PYTHON = C:\Users\user\anaconda3\python.exe
PATH Variables:
%SPARK_HOME%\bin
%HADOOP_HOME%\bin
%SCALA_HOME%\bin
%JAVA_HOME%\bin

Jupyter Environment Variables

PYSPARK_DRIVER_PYTHON = C:\Users\user\anaconda3\Scripts\jupyter.exe
PYSPARK_DRIVER_PYTHON_OPTS = notebook
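
A quick way to confirm the variables listed above are in place before launching Spark is a small standard-library check. The sketch below is illustrative only: the function name check_spark_env and the rule that each *_HOME should have its \bin folder on PATH are assumptions drawn from the list above, not part of the original training files.

```python
import os

# Variable names are taken from the list above; the check itself is an illustrative sketch.
REQUIRED_VARS = ["HADOOP_HOME", "JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "PYSPARK_PYTHON"]
PATH_HOMES = ["SPARK_HOME", "HADOOP_HOME", "SCALA_HOME", "JAVA_HOME"]


def check_spark_env(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # 1. Every required variable must be set to a non-empty value.
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    # 2. Windows separates PATH entries with ';' and compares paths case-insensitively.
    path_entries = {
        entry.strip().rstrip("\\").lower()
        for entry in env.get("PATH", "").split(";")
        if entry.strip()
    }
    for name in PATH_HOMES:
        home = env.get(name)
        if home:
            expected = (home.rstrip("\\") + "\\bin").lower()
            if expected not in path_entries:
                problems.append(f"PATH is missing %{name}%\\bin")

    return problems


if __name__ == "__main__":
    issues = check_spark_env()
    print("\n".join(issues) if issues else "Spark environment variables look consistent.")
```

Run it from a new Command Prompt after saving the variables, since already-open terminals keep the old environment. The Jupyter variables are left out of the check because they are only needed when the pyspark command should open a notebook.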

Step 2: Local Spark Shell

people.json

Step 3: Spark cluster in Google Cloud

wordcount.py

pi.py

Step 4: Spark cluster in AWS Databricks

telecom_churn.csv

Step 5: Spark on Colab

How To Start A Spark Session & Read in CSV frrom Website.ipynb

Datacamp Pyspark Cheatsheets

Datacamp - PySpark_RDD_Cheat_Sheet.pdf

Datacamp - PySpark_SQL_Cheat_Sheet.pdf


TensorFlow Training



Python Training



List / Tuples / Dictionary / Sets





Python Cheatsheets


PANDAS Training


Data Cleansing and Wrangling




Machine Learning Training



Confusion Matrix


https://mlr-org.com/

https://www.mltut.com/

https://mlu-explain.github.io/

Learn ML by Shivam Modi.pdf

ML guide with Code by Shivam Modi.pdf

ML Life Cycle by Shivam Modi.pdf

Quick Machine Learning in Python.pdf

ML DL AI Cheat sheet by NIKHIL YADAV.pdf

ML Cheatsheet.pdf

Another ML Cheatsheet.pdf

ML Math.pdf

Machine Learning Infographics Cheatsheet.pdf

ML Cheat sheet by Business Science.io.pdf

The Little Book of Deep Learning

Probability Cheatsheet

AI for Everyone notes by Andrew Ng

Dataiku ML Basics.pdf

Lecture Notes on AI.pdf

How to Load the Iris Dataset into Python by Dr. Alvin Ang.ipynb

Various Places to Get Datasets for Machine Learning by Dr Alvin Ang.ipynb

Various Ways of Train Test Splits with Python by Dr Alvin Ang.ipynb

https://machinelearningprojects.net/

ML Cheatsheet by DataCamp.pdf

https://thecleverprogrammer.com/2020/11/15/machine-learning-projects/

30 Python Libraries to Boost Your Data Science Productivity.pdf

https://riverml.xyz/latest/

https://www.obviously.ai/

https://mlflow.org/

https://terencelucasyap.com/predicting-singapore-pools-4d-lottery-winning-numbers-machine-learning/

Scalable Efficient Big Data Pipeline Architecture – Machine Learning for Developers

https://ciml.info/

https://www.akkio.com/beginners-guide-to-machine-learning

ML Terminology - Chris Albon.zip

MLOps

MLOps Explained by Ubuntu.pdf

Guide to MLOPs by Ubuntu.pdf

MLOps for Dummies Databricks.pdf

https://www.dailydoseofds.com/mlops-crash-course-part-1/

Unsupervised Learning

Clustering

Overview of Clustering Methods.pdf

KMeans_using_Python_by_Dr_Alvin.ipynb

Hierarchical_Clustering_using_Python.ipynb

Clustering Cheatsheet by Business Science.io.pdf

PCA

PCA_with_Python.ipynb


Train Test Split


Supervised Learning

Linear / Multiple / Polynomial Regression

Advertising.csv

automobileEDA.csv

Simple Linear Regression with Statsmodel by Dr Alvin Ang.ipynb

Simple Linear Regression using SKLearn by Dr Alvin Ang.ipynb

Multiple Regression using Scikit Learn with Python by Dr Alvin Ang.ipynb (Advertising.csv)

Multiple Regression using Scikit Learn with Python (Part II) by Dr Alvin Ang.ipynb (AutomobileEDA.csv)

Polynomial Regression with Python by Dr Alvin Ang.ipynb

Using Multiple Regression OLS to do Feature Importance Selection (on Iris Dataset) by Dr Alvin Ang.ipynb


Support Vector Machine (SVM)

Understanding SVM using Python by Dr Alvin Ang.ipynb

Simple SVM Applied to Iris Dataset with Python by Dr Alvin Ang.ipynb

Grid, Random and Bayes Search - Hyperparameter Tuning on SVM with Python by Dr Alvin Ang.ipynb

Decision Tree / Random Forest

wine_small.csv

wine.csv

Decision Tree (Classification) on the Iris Flower Dataset using Python by Dr Alvin Ang.ipynb

Using Decision Tree Classifier (DTC) to do Feature Importance Selection (on Iris Dataset) by Dr Alvin Ang.ipynb

Using Random Forest Classifier (RFC) to do Feature Importance Selection (on Iris Dataset) by Dr Alvin Ang.ipynb

Random Forest (Classification) on the Iris Flower Dataset using Python by Dr Alvin Ang.ipynb

Metrics, Normalization and Regularizations

Classification Metrics for ML Models by Dr Alvin Ang.ipynb


Bias / Variance

Understanding Bias vs Variance in Python by Dr. Alvin Ang.ipynb


L1 and L2 Regularization

L1 Lasso and L2 Ridge and Elastic Net Regression using Python by Dr Alvin Ang.ipynb


MinMax and Standard Scaler

Decision Tree (Classification) on the Iris Flower Dataset using Python by Dr Alvin Ang.ipynb


Datacamp ML Cheatsheets


ML4Trading




Technical Analysis


https://eoddata.com/stocklist/SGX.htm (ticker symbol)

https://autochartist.com/

https://ml4trading.io/

https://www.trade-ideas.com/

https://seekingalpha.com/

https://algorithmictrading.substack.com/

https://www.quantfactory.ai/

https://www.quantrocket.com/

https://roic.ai/

https://www.backtrader.com/

https://www.priceactionlab.com/Blog/price-action-lab-software/

https://www.tradeoxy.com/

https://wire.insiderfinance.io/

https://tradologics.com/

https://www.alpharithms.com/

https://www.gurufocus.com/guru/warren%2Bbuffett/summary

https://danelfin.com/

https://wavebasis.com/

https://fractalerts.com/

https://www.benzinga.com/apis/

https://forgeglobal.com/

https://equityzen.com/

https://www.youtube.com/@Algovibes

https://greyhoundanalytics.com/

https://www.fxmagnetic.com

https://www.histdata.com/

https://www.backtestzone.com/

https://million-moves.com/

https://www.quantreo.com/

https://polygon.io/

https://insightsentry.com/

https://www.ssga.com/sg/en/individual

https://www.dividends.sg/

https://sginvestors.io/

https://quickfs.net/

https://singaporeanstocksinvestor.blogspot.com/ (AK)

https://www.dymonasia.com/career/

https://www.xtxmarkets.com/

https://www.360t.com/

https://www.qcp.capital/

https://www.tower-research.com/

https://www.jumptrading.com/

https://drw.com/

https://www.virtu.com/

https://24exchange.com/

https://fairxchange.co.uk/

https://www.bgcg.com/bgc/

https://www.preqin.com/

https://www.okx.com/

https://group.softbank/en

https://www.globalxetfs.com/

https://www.ark-funds.com/

https://eodhd.medium.com/trading-predictions-using-ai-and-python-cdaad4de3447

https://thepythonlab.medium.com/xgboost-in-stock-returns-prediction-using-technical-indicators-and-vix-index-700fda74b425

https://github.com/suparjotamin/stockie

https://medium.com/geekculture/a-simple-way-to-download-financial-data-from-investing-com-in-python-8262271c804f

https://medium.com/trading-data-analysis/metatrader5-python-trading-bot-230bd19285e9


R Training


Files

eisenhower.txt

climate change.txt

https://r-graphics.org/

https://ggplot2-book.org/

https://therinspark.com/

https://r4ds.had.co.nz/index.html

https://rstats.wtf/

https://www.tidytextmining.com/

Datacamp R Cheatsheets

Datacamp - R_Cheat_Sheet.pdf

Datacamp - Working_With_Text_Data_in_R.pdf

Datacamp - Working_with_Dates_and_Time_in_R.pdf

Datacamp - Reshaping_data_with_tidyR_in_R.pdf

Datacamp - ggplot2_cheat_sheet.pdf

Datacamp - data table cheat sheet_R.pdf

Datacamp - Manipulating_Data_in_dplyr_Cheat_Sheet.pdf

Datacamp - Tidyverse_Cheat_Sheet.pdf

Data_Science_With_R_Workflow by Business Science.io.pdf

R Sites

https://togaware.com/projects/rattle/index.html

https://universeofdatascience.com/

https://www.r-bloggers.com/2022/06/the-most-overlooked-r-package-that-can-get-you-through-a-data-science-job-interview/

https://online.stat.psu.edu/statprogram/tutorials/statistical-software/r

https://biostat.app.vumc.org/wiki/Main/RS

https://tuos-bio-data-skills.github.io/intro-stats-book/

https://statsandr.com/

https://www.reneshbedre.com/

https://style.tidyverse.org/

https://cran.r-project.org/web/packages/available_packages_by_name.html

https://r-charts.com/

https://r-graph-gallery.com/

https://posit.co/resources/cheatsheets/

https://yihui.shinyapps.io/formatR/

https://www.kaggle.com/code/rtatman/data-cleaning-challenge-json-txt-and-xls

https://education.rstudio.com/learn/

https://www.business-science.io/finance/2020/02/26/r-for-excel-users.html

https://www.rdocumentation.org/

https://r-packages.io/

https://graphic-walker-data-explorer.netlify.app/

https://github.com/Kanaries/GWalkR


Tableau Training


1st Project


Extras



Tableau Desktop Specialist




Storytelling




Tableau Data Analyst




SQL Training




Power BI Training



Resource Allocation with Excel Solver



Statistics Training






Degrees of Freedom


Statistics Cheatsheets




Excel Training



Power Query / Power Pivot



Dashboard with Excel



Design of Experiments (DOE)



Flexsim



WEKA



GOOGLE WORKSPACE



Google Sheets



Looker Studio



Tech with Tim's Facial Recognition Project (on local laptop)



Data Quality Training



Splunk Tutorial



UCMHP



Warehousing & Inventory Management
