Data Analysis and Visualization

Top energy consumption and GDP

Gender gap in college degrees

Gender differences across college majors from 1970 to 2011. Used Pandas, Matplotlib.

Earnings and college majors

Which majors lead to high-salary jobs? Which majors have the highest unemployment? Are there differences between genders? Used Pandas, NumPy, Matplotlib, Seaborn.

eBay used cars

Which car brands are the most expensive? How long does it take to sell a used car? How much does high mileage affect the price? Used Pandas, NumPy, Matplotlib, Seaborn.
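As a rough illustration of how the brand and mileage questions could be approached with Pandas, here is a minimal sketch. The rows and the column names (brand, price, odometer_km) are made up for the example and are not taken from the actual eBay dataset or the project notebook.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical, not the real dataset schema.
cars = pd.DataFrame({
    "brand": ["bmw", "bmw", "ford", "ford", "audi"],
    "price": [12000, 8000, 3000, 5000, 15000],
    "odometer_km": [50000, 150000, 200000, 125000, 30000],
})

# Average listing price per brand, most expensive first.
mean_price_by_brand = (
    cars.groupby("brand")["price"].mean().sort_values(ascending=False)
)

# Pearson correlation between mileage and price; a negative value
# suggests that higher mileage goes with lower prices.
mileage_price_corr = cars["odometer_km"].corr(cars["price"])

print(mean_price_by_brand)
print(mileage_price_corr)
```

On real listings, outliers in price would likely need to be filtered out before grouping, since a few extreme values can dominate a mean.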

Star Wars survey analysis

Which Star Wars movies are the most liked? Are there differences between gender groups? Are there differences between age groups? Used NumPy, Pandas, Matplotlib, Seaborn.

NYC high school SAT score analysis

Do demographics affect schools? Do race and gender affect SAT scores? Which schools are the best in terms of SAT scores? Is there any relation between AP tests and SAT scores? Does class size affect SAT scores? Used NumPy, Pandas, Matplotlib, Seaborn.

Employee Exit Survey Analysis

Which factors affect employee dissatisfaction? Is age a factor in resignation? Used Pandas, NumPy, Matplotlib, Seaborn.

Wine Quality Dataset

Do certain kinds of wine have higher quality? Do sweeter wines receive better ratings? Do wines with higher alcohol content receive better ratings?

Machine Learning

Titanic Survival

Predict the survival of Titanic passengers using logistic regression. Used Scikit-learn, Pandas.
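A minimal sketch of the logistic regression workflow with Scikit-learn is shown below. It runs on synthetic stand-in data, not the real Titanic dataset; the features (age, is_female, pclass) and the rule used to generate the labels are assumptions chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real Titanic dataset).
rng = np.random.default_rng(0)
n = 400
age = rng.uniform(1, 80, n)
is_female = rng.integers(0, 2, n)   # 0 or 1
pclass = rng.integers(1, 4, n)      # 1, 2, or 3

# Hypothetical rule used only to generate labels for this sketch.
logit = 2.0 * is_female - 0.8 * (pclass - 2) - 0.02 * (age - 30)
survived = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([age, is_female, pclass])
X_train, X_test, y_train, y_test = train_test_split(
    X, survived, test_size=0.25, random_state=0
)

# Fit the classifier and evaluate it on the held-out split.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)               # fraction classified correctly
probabilities = model.predict_proba(X_test)[:, 1]    # estimated P(survived = 1)

print(accuracy)
```

With real passenger data, categorical columns would need encoding and missing ages would need handling before fitting; those steps are omitted here.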

Social network marketing classifier

Classify whether a customer is relevant for an advertisement using logistic regression. Used Scikit-learn, Pandas.

Big data analysis using Hadoop, Spark

Hadoop mini project

Analyzed Stack Exchange question-answer data and Reddit data using MapReduce in Java and Python.
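The MapReduce pattern itself can be sketched in plain Python without a Hadoop cluster. The example below counts tags across a few made-up posts; the record layout and the function names (map_post, shuffle, reduce_counts, run_job) are hypothetical and only illustrate the map, shuffle, and reduce phases, not the project's actual jobs.

```python
from collections import defaultdict


def map_post(post):
    """Map phase: emit a (tag, 1) pair for every tag on a post."""
    for tag in post["tags"]:
        yield tag, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def run_job(posts):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for post in posts for pair in map_post(post))
    grouped = shuffle(pairs)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Made-up input records standing in for question data.
posts = [
    {"tags": ["python", "pandas"]},
    {"tags": ["python", "hadoop"]},
    {"tags": ["hadoop"]},
]
tag_counts = run_job(posts)
print(tag_counts)
```

On a real cluster the framework performs the shuffle between mappers and reducers; it is written out here only to make the data flow visible.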

Spark mini project

Analyzed the MovieLens dataset, a soccer dataset, and a flight schedule dataset using PySpark.

Real-time data analysis using Storm

Storm mini project

Implemented real-time data analysis using Storm.