WORKING FILES



Apache Spark Training


Step 1: Local Laptop Cluster Setup

Walk them through the local laptop cluster setup, then have them repeat it on another laptop.

scala-2.12.4

jdk-8u202-windows-x64

Anaconda3-2022.10-Windows-x86_64

https://www.apache.org/dyn/closer.lua/spark/spark-3.3.1/spark-3.3.1-bin-hadoop2.tgz

https://github.com/steveloughran/winutils/archive/refs/heads/master.zip

Environment Variables (Windows)

HADOOP_HOME = C:\SPARK\hadoop
JAVA_HOME = C:\Program Files\Java\jdk1.8.0_202
SCALA_HOME = C:\SPARK\scala
SPARK_HOME = C:\SPARK\spark
PYSPARK_PYTHON = C:\Users\user\anaconda3\python.exe
PATH Variables:
%SPARK_HOME%\bin
%HADOOP_HOME%\bin
%SCALA_HOME%\bin
%JAVA_HOME%\bin

Jupyter Environment Variables

PYSPARK_DRIVER_PYTHON = C:\Users\User\anaconda3\Scripts\jupyter.exe
PYSPARK_DRIVER_PYTHON_OPTS = notebook
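
A quick way to confirm the variables above took effect is to check them from Python in a fresh terminal. The following is a minimal sketch, not part of the original course files: the helper names (`missing_vars`, `missing_path_entries`) are hypothetical, and it assumes Windows has already expanded entries such as `%SPARK_HOME%\bin` by the time the process reads `PATH`.

```python
import os
from pathlib import PureWindowsPath

# Variables listed in the setup above; the Jupyter pair is only needed for notebook use.
REQUIRED_VARS = ["HADOOP_HOME", "JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "PYSPARK_PYTHON"]
JUPYTER_VARS = ["PYSPARK_DRIVER_PYTHON", "PYSPARK_DRIVER_PYTHON_OPTS"]
HOME_VARS_ON_PATH = ["SPARK_HOME", "HADOOP_HOME", "SCALA_HOME", "JAVA_HOME"]


def missing_vars(env, names):
    """Return the names that are unset or empty in the env mapping."""
    return [name for name in names if not env.get(name)]


def missing_path_entries(env):
    """Return the expected bin folders that are not present in PATH."""
    # Normalise each PATH entry (drop trailing slashes, ignore case) before comparing.
    entries = {
        str(PureWindowsPath(p.strip())).lower()
        for p in env.get("PATH", "").split(";")
        if p.strip()
    }
    missing = []
    for name in HOME_VARS_ON_PATH:
        home = env.get(name)
        if not home:
            continue  # an unset variable is already reported by missing_vars
        bin_dir = str(PureWindowsPath(home) / "bin").lower()
        if bin_dir not in entries:
            missing.append(bin_dir)
    return missing


if __name__ == "__main__":
    print("Missing variables:", missing_vars(os.environ, REQUIRED_VARS))
    print("Missing Jupyter variables:", missing_vars(os.environ, JUPYTER_VARS))
    print("Missing PATH entries:", missing_path_entries(os.environ))
```

Three empty lists suggest the setup is complete; anything printed points at the variable or PATH entry to revisit.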

Step 2: Local Spark Shell

people.json
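
As a rough illustration of what this step might look like from PySpark, the sketch below writes a small JSON Lines file and loads it into a DataFrame on a local session. The sample records and the helper names (`write_json_lines`, `show_people`) are assumptions for illustration; the actual people.json used in the course may differ, and `show_people` needs PySpark installed.

```python
import json
from pathlib import Path

# Hypothetical sample records; the real people.json may contain different data.
SAMPLE_PEOPLE = [
    {"name": "Michael"},
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
]


def write_json_lines(path, records):
    """Write one JSON object per line, the layout spark.read.json reads by default."""
    text = "\n".join(json.dumps(record) for record in records) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def show_people(path):
    """Load the file into a DataFrame on a local Spark session and print it."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    spark = SparkSession.builder.master("local[*]").appName("people-demo").getOrCreate()
    try:
        df = spark.read.json(str(path))
        df.printSchema()
        df.show()
    finally:
        spark.stop()
```

Calling `write_json_lines("people.json", SAMPLE_PEOPLE)` followed by `show_people("people.json")` should print the inferred schema and the rows, provided the Step 1 setup works.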

Step 3: Spark cluster in Google Cloud

wordcount.py

pi.py

Step 4: Spark cluster in AWS Databricks

telecom_churn.csv

Step 5: Spark on Colab

How To Start A Spark Session & Read in CSV frrom Website.ipynb

Datacamp Pyspark Cheatsheets

Datacamp - PySpark_RDD_Cheat_Sheet.pdf

Datacamp - PySpark_SQL_Cheat_Sheet.pdf


Tensorflow Training



Python Training



List / Tuples / Dictionary / Sets





Python Cheatsheets


PANDAS Training

Statistics

[Descriptive.Stats] + [Seaborn.Visualization] + [Hypothesis.Testing.ANOVA] + [LR.MR.R2]

Day 4 with Dr Alvin.ipynb (Statistics)


Python Visualization

Data Visualization with Python by Dr Alvin Ang.ipynb

Top 7 Python Libraries for Data Visualization.pdf

Python Plotly.pdf

Practical Guide to Matplotlib.pdf


Python Specials

DataPrep + MissingNo by Dr Alvin Ang.ipynb

Ways to Display Json Formats Neatly in Python by Dr Alvin Ang.ipynb

How to Create Random Data with Python by Dr Alvin Ang.ipynb

ProxyCurl_API.ipynb

https://nubela.co/proxycurl/linkdb.html (scrape LinkedIn)

boston_housing_data.csv

Canada.xlsx

List of Seaborn CSVs

fmri.csv

module_5_auto.csv


https://github.com/xiaohk/stickyland

https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/Index.ipynb

https://www.kdnuggets.com/2022/12/top-5-nlp-cheat-sheets-beginners-professional.html

AirBnB NYC Data 2019.csv

https://docs.python.org/3/library/functions.html

https://docs.python.org/3/library/stdtypes.html

https://docs.python.org/3/index.html

https://docs.python.org/3/library/index.html

https://fortune.com/education/articles/using-python-for-data-science/

Bad Python in Excel Review

https://www.iesve.com/software/python-scripting


Data Cleansing and Wrangling




Machine Learning Training



Confusion Matrix


MLOps

Unsupervised Learning

Clustering

PCA


Train Test Split


Supervised Learning

Linear / Multiple / Polynomial Regression


Support Vector Machine (SVM)

Decision Tree / Random Forest

Metrics, Normalization and Regularizations


Bias / Variance


L1 and L2 Regularization


MinMax and Standard Scaler


Datacamp ML Cheatsheets


ML4Trading




Technical Analysis



R Training





Tableau Training


1st Project


Extras





Different Map Types in Tableau


2nd Project


Storytelling


Tableau Desktop Specialist







Tableau Data Analyst




SQL Training



Power BI Training


Resource Allocation with Excel Solver



Statistics Training






Degree of Freedom


Statistics Cheatsheets




Excel Training



Power Query / Power Pivot



Dashboard with Excel



Design of Experiments (DOE)



Flexsim



WEKA



GOOGLE WORKSPACE



Google Sheets



Looker Studio



Tech with Tim's Facial Recognition Project (on local laptop)



Data Quality Training



Splunk Tutorial



UCMHP



Warehousing & Inventory Management
