Tuesday, December 26, 2017

Deep Learning Skills / Data Science

Programming languages (Python, R, Lua, Scala, …) and multiple frameworks and technologies (TensorFlow, Torch, Hadoop, Spark, RDBMS, …) to support the modeling requirements

Deep learning, other AI, natural language processing, data mining, information theory, and optimization

Python, R, Lua, Scala, C++

Major deep learning libraries: TensorFlow, Torch, DeepLearning4J


Distributed systems (e.g. Spark, Hadoop, Ignite, …)

Big data visualization

Substantial programming experience with almost all of the following: SAS (STAT, macros, EM), R, H2O, Python, SPARK, SQL, other Hadoop. Exposure to GitHub.
Modeling techniques such as linear regression, logistic regression, survival analysis, GLM, tree models (Random Forests and GBM), cluster analysis, principal components, feature creation, and validation. Strong expertise in regularization techniques (Ridge, Lasso, elastic nets), variable selection techniques, feature creation (transformation, binning, high level categorical reduction, etc.) and validation (hold-outs, CV, bootstrap).
Database systems (Oracle, Hadoop, etc.), ETL/data lineage software (Informatica, Talend, AbInitio)

Data visualization (e.g. R Shiny, Spotfire, Tableau)

AWS ecosystem: experience with S3, EC2, EMR, Lambda, Redshift

Data pipelines: Airflow, Luigi, Talend, or AWS Data Pipeline

APIs: Google, YouTube, Facebook, Twitter, or OAuth

Version control (GitHub, Stash, etc.)

Sunday, December 24, 2017

VBA code for loops play

'Option Explicit

Sub Sample()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Range("G2").Value = Now

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    For i = 1 To 4: For j = 1 To 4
    For k = 1 To 8: For l = 1 To 12
        Range("G" & lastrow).Value = Range("A" & i).Value & "/" & _
                                     Range("B" & j).Value & "/" & _
                                     Range("C" & k).Value & "/" & _
                                     Range("D" & l).Value
        lastrow = lastrow + 1
        CountComb = CountComb + 1
    Next: Next
    Next: Next

    Range("G1").Value = CountComb
    Range("G3").Value = Now

    Application.ScreenUpdating = True
End Sub

Sub Sample2()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    For i = 1 To 4: For j = 1 To 4
    For k = 1 To 8: For l = 1 To 12
        Cells(i, 20) = i
        Cells(j, 21) = j
        Cells(k, 22) = k
        Cells(l, 23) = l
        lastrow = lastrow + 1
        CountComb = CountComb + 1
        Cells(lastrow, 24) = lastrow
        Cells(CountComb, 25) = CountComb
    Next: Next
    Next: Next

    Application.ScreenUpdating = True
End Sub

Sub Sample3()
    Dim i As Long, j As Long, k As Long, l As Long
    Dim CountComb As Long, lastrow As Long

    Application.ScreenUpdating = False

    CountComb = 0: lastrow = 6

    For i = 1 To 4
        For j = 1 To 4
            For k = 1 To 8
                For l = 1 To 12
                    Cells(i, 20) = i
                    Cells(j, 21) = j
                    Cells(k, 22) = k
                    Cells(l, 23) = l
                    lastrow = lastrow + 1
                    CountComb = CountComb + 1
                    Cells(lastrow, 24) = lastrow
                    Cells(CountComb, 25) = CountComb
                Next l
            Next k
        Next j
    Next i

    Application.ScreenUpdating = True
End Sub
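
Sub playarray()

    Dim myArray As Variant
    Dim myThirdColumn As Variant

    ' Load a 2-D array from a range first (sheet name assumed, same as the Test subs)
    myArray = Worksheets("SheetA").Range("A1:E10").Value

    ' Slice out column 3 without a loop
    myThirdColumn = Application.Index(myArray, , 3)

End Sub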

' https://usefulgyaan.wordpress.com/2013/06/12/vba-trick-of-the-week-slicing-an-array-without-loop-application-index/

Sub Test()

    Dim varArray()          As Variant
    Dim varTemp()           As Variant
Dim myRng As Range

'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

Set myRng = Worksheets("SheetA").Range("A1:E10")
     varArray = myRng.Value
   varTemp = Application.Index(varArray, , 2)
 '  varTemp = Application.Index(varArray, Array(2, 3), 0)
  '  varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))
MsgBox UBound(varTemp) - LBound(varTemp) + 1
    'MsgBox varArray(1, 1)

End Sub

Sub Test2()

    Dim varArray()          As Variant
    Dim varTemp()           As Variant
    Dim varTemp2            As Variant
    Dim myRng As Range

'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

Set myRng = Worksheets("SheetA").Range("A1:Z10")
     varArray = myRng.Value
   varTemp = Application.Index(varArray, 3)
    varTemp2 = Application.Index(varArray, , 3)
 '  varTemp = Application.Index(varArray, Array(2, 3), 0)
  '  varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))
'MsgBox UBound(varTemp) - LBound(varTemp) + 1
'MsgBox varArray(1, 1)
'MsgBox UBound(varTemp2) - LBound(varTemp2) + 1
MsgBox varTemp2(10, 1)
' Arrays read from a Range or returned by Application.Index start at 1

End Sub


Sub Test3()

    Dim varArray()          As Variant
    Dim varTemp()           As Variant
Dim myRng As Range

'Application.Index([A1:E10], , 2) = Application.Index(varArray, , 2)

Set myRng = Worksheets("SheetA").Range("A1:Z10")
     varArray = myRng.Value
   varTemp = Application.Index(varArray, Array(1, 2))
    'first two row elements
    'varTemp2 = Application.Index(varArray, , 3)
 '  varTemp = Application.Index(varArray, Array(2, 3), 0)
  '  varTemp = Application.Index(varArray, , Application.Transpose(Array(2)))
'  MsgBox Array(1, 2)(0)
 MsgBox varTemp(1)
 ' the first element actually using array command
 ' the above var temp starts with 1 and not with 0
'MsgBox UBound(varTemp) - LBound(varTemp) + 1
'MsgBox varArray(1, 1)
'MsgBox UBound(varTemp2) - LBound(varTemp2) + 1
'MsgBox varTemp2(10, 1)
' Arrays read from a Range or returned by Application.Index start at 1

End Sub

Thursday, December 14, 2017

Big Data Financial Engineering

Tools and plays

Kafka, Elastic MapReduce, Avro, Parquet, Storm, HBase

Node.js or Java
- Either: Kafka, Storm, Neo4j or HBase
- Mongoose
- Solr/Lucene

Cassandra, Spark

Deep working experience applying machine learning and statistics to real world problems
Solid understanding of a wide range of data mining / machine learning software packages (e.g., Spark ML, scikit-learn, H2O, Weka, Keras)
Experience with version control systems (git) and comfortable using command-line tools

Knowledge of semantic web technology (e.g., RDF, OWL, SPARQL)
Knowledge of search technologies (e.g., Solr, ElasticSearch)
A link to a portfolio and/or code samples demonstrating your work experience (GitHub, Kaggle, KDD contributions earn major props)

Data Analyst – BI - Training:

Coding data extraction, transformation and loading (ETL) routines.
APIs and databases to pull data together

Experience with Hadoop, SQL and NoSQL technologies is required, as well as basic scripting experience in a dynamic language, such as Python or R.
Tools like Jethro, Kyvos, Dremio, AtScale, etc.
BI tools like Tableau, Domo, QlikView, etc.
Data visualization
Relational Databases (e.g., Postgres, SQL Server, Oracle, MySQL)
Distributed Databases (e.g., Hive, Redshift, Greenplum)
NoSQL Data Frameworks (e.g., Spark, Mongo, Cassandra, HBase)
Data Analysis and Transformation (e.g., R, Matlab, Python, etc.)

Big Data providers: Cloudera CDH, Hortonworks HDP and Amazon EC2/EMR for deploying and developing large scale solutions.
Hadoop/Spark Big Data Environment Clusters using Foreman, Puppet and Vagrant. Deploy Big Data Platforms (including Hadoop & Spark) to multiple clusters using Cloudera CDH, on both CDH4 and CDH5.
Hadoop MapReduce, YARN, HBase, Spark performance for large-scale data analysis.
Spark performance based on Cloudera and Hortonworks HDP cluster setup in Production Server.
Machine learning data models on terabytes of data using Spark ML and MLlib libraries.
ETL systems using Python, Hive and the Apache Spark SQL framework. Storing all the result files in Apache Parquet and mapping them to Hive for enterprise data warehousing.
Real-time data pipelines using Kafka and Python consumers to ingest data through the Adobe Real-time Firehose API into Elasticsearch, with real-time dashboards built in Kibana.
Airbnb Airflow tool, to run the machine learning scripts as a DAG.
Test cases using Python Nose framework.
Scikit-learn Python scripts ported to Spark ML/MLlib scripts, resulting in a scalable pipeline computing framework.
Data Pipelines using Spark and Scala on AWS EMR framework and S3.
Real-time Data pipelines using Spark Streaming and Apache Kafka in Python.
Real-time Data pipelines using Apache Storm Java API for processing live streams of data and ingesting to HBase.
Data pipelines on Cloudera/Hortonworks Hadoop Platform using Apache Pig and automating workflow using Apache Oozie.

Technology: Hadoop Ecosystem /Spring Boot/Microservices/AWS /J2SE/J2EE/Oracle
DBMS/Databases: DB2, MySQL, SQL, PL/SQL
Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive/Impala, Pig, Sqoop, Zookeeper, HBase, Spark, Scala
NoSQL Databases: MongoDB, HBase
Version Control Tools: SVN, CVS, VSS, PVCS

Wednesday, September 6, 2017

Advanced Data Science with Python: Machine Learning

Knowledge of Python programming and basic features of Python
Able to munge, analyze, and visualize data in Python with Pandas and charting


Class 1: Introduction and Regression

How to dive into Machine Learning
Simple Linear Regression and Multiple Linear Regression
Forward and Backward Selection
Numpy/Scikit-Learn Lab

Class 2:
Classification I
Logistic Regression - Application in Default and other variables
Discriminant Analysis
Naive Bayes
Supervised Learning Lab

Resampling and Model Selection

Bootstrap - Breaking it down into simple
Feature Selection
Model Selection and Regularization lab
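
The bootstrap topic above can be illustrated with a minimal sketch in plain Python. The sample values, the number of resamples, and the function name `bootstrap_mean_ci` are assumptions made for illustration; they are not taken from the course material.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = [rng.choice(data) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Made-up measurements used only to exercise the function.
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.9, 12.5, 11.0, 9.5]
low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

The percentile method shown here is only one of several bootstrap interval constructions; it was chosen because it needs nothing beyond sorting the resampled means.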

Class 3:
Classification II
Support Vector Machines SVM
Decision Trees - and Branch Analysis
Bagging and Random Forests
Decision Tree in Python and SVM Lab

Class 4:
Unsupervised Learning - Breaking it down
Principal Component Analysis
K-means and Hierarchical Clustering
PCA and Clustering Lab

Recommended Readings

An Introduction to Statistical Learning, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
Applied Predictive Modeling, by Max Kuhn and Kjell Johnson
Machine Learning for Hackers, by Drew Conway, John White

R Course Recommended Readings

An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
Applied Predictive Modeling, by Max Kuhn and Kjell Johnson
Data Mining with R, by Luis Torgo
Machine Learning with R, by Brett Lantz


Saturday, September 2, 2017






Creating Backend:


Need to install SQL


Sunday, August 27, 2017

R Code



library(dplyr)

chicago <- readRDS("chicago.rds")



select(chicago, -(city:dptp))

chicago <- arrange(chicago, date)

head(select(chicago, date, pm25tmean2), 3)

chicago <- rename(chicago, dewpoint = dptp, pm25 = pm25tmean2)

chicago <- mutate(chicago, pm25detrend = pm25 - mean(pm25, na.rm = TRUE))

chicago <- mutate(chicago, year = as.POSIXlt(date)$year + 1900)

years <- group_by(chicago, year)



players <- group_by(Batting, playerID)
games <- summarise(players, total = sum(G))
head(arrange(games, desc(total)), 5)

#for checking group by


mydata = read.csv("sampledata.csv")


mydata2 = select(mydata, Index, State:Y2008)

mydata7 = filter(mydata, Index == "A")

mydata6 = rename(mydata, Index1=Index)

mydata8 = filter(mydata6, Index1 %in% c("A", "C") & Y2002 >= 1300000 )

summarise(mydata, Y2015_mean = mean(Y2015), Y2015_med=median(Y2015))

dt = mydata %>% select(Index, State) %>% sample_n(10)

t = summarise_at(group_by(mydata, Index), vars(Y2011, Y2012), funs(n(), mean(., na.rm = TRUE)))

t = mydata %>% group_by(Index) %>%
  summarise_at(vars(Y2011:Y2015), funs(n(), mean(., na.rm = TRUE)))

t = mydata %>% filter(Index %in% c("A", "C","I")) %>% group_by(Index) %>%
  do(head( . , 2))

head(mydata, 2)


t = mydata %>% select(Index, Y2015) %>%
  filter(Index %in% c("A", "C","I")) %>%
  group_by(Index) %>%
  do(arrange(.,desc(Y2015))) %>%  slice(3)

t = mydata %>% select(Index, Y2015) %>%
  filter(Index %in% c("A", "C","I")) %>%
  group_by(Index) %>%
  filter(min_rank(desc(Y2015)) == 3)

t = mydata %>%
  summarise(Mean_2014 = mean(Y2014, na.rm=TRUE),
            Mean_2015 = mean(Y2015, na.rm=TRUE))



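# flights is the data set from the nycflights13 package
flights_sml <- select(flights,
                      arr_delay, dep_delay, distance, air_time)

mutate(flights_sml,
       gain = arr_delay - dep_delay,
       speed = distance / air_time * 60)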

arrange(flights, desc(arr_delay))



df <- tibble(x = c(5, sin(1), NA))

arrange(df, desc(is.na(x)))

x <- flights %>% mutate(travel_time = ifelse((arr_time - dep_time < 0),
                                        2400+(arr_time - dep_time),
                                        arr_time - dep_time)) %>%
  arrange(travel_time) %>% select(arr_time, dep_time, travel_time)


arrange(flights, desc(distance)) %>% select(1:5, distance)

flights %>% select(matches("^dep|^arr_time$|delay$"))


mutate(flights,
       gain = arr_delay - dep_delay,
       hours = air_time / 60,
       gain_per_hour = gain / hours)

Friday, July 21, 2017

Web Scraping and Content Mining

Most interesting course in NYC.
2 sessions workshop
Web scraping is a method for extracting text from websites so that it can be analyzed. It is a form of content mining: collecting useful information from websites, including quotes, prices, news, company info, etc. The data is gathered directly, either by reading a website's HTML code or through visual abstraction techniques, using the Python programming language.
We start the workshop by exploring different methods of gathering data from the Web. We go through the whole process of gathering, storing and analyzing data. For our examples we use real-life financial quotes and annual reports (10-K). During the course we learn how to use numerous Python libraries: urllib, Requests, Wget, BeautifulSoup 4, SSL, PDFMiner3k, Twitter and others.
We also learn to construct regular expression patterns to find targeted information on Web pages. As part of content mining, we build a Twitter application to search and analyze trends.
The price is for two classes:
You will Learn:
BeautifulSoup Python Library
How to use Urllib and Requests
Regular Expressions patterns
Read and analyze PDF files
Store Data with CSV files and SQL Database
Create Twitter app
Build Custom Google Search Engine
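
As a rough illustration of the gather, extract and store steps described above, here is a minimal sketch that uses only the Python standard library. It swaps in `html.parser` for BeautifulSoup and parses an inline HTML string instead of downloading a page; the HTML snippet, the ticker symbols and the prices are all made up.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Made-up HTML standing in for a downloaded quotes page.
HTML = """
<table id="quotes">
  <tr><td class="symbol">AAA</td><td class="price">$101.25</td></tr>
  <tr><td class="symbol">BBB</td><td class="price">$47.10</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell together with its class attribute."""

    def __init__(self):
        super().__init__()
        self.current_class = None
        self.cells = []  # list of (class_name, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.current_class = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.current_class = None

    def handle_data(self, data):
        if self.current_class and data.strip():
            self.cells.append((self.current_class, data.strip()))


parser = CellCollector()
parser.feed(HTML)

symbols = [text for cls, text in parser.cells if cls == "symbol"]
# Regular expression: optional dollar sign, digits, optional decimal part.
price_pattern = re.compile(r"\$?(\d+(?:\.\d+)?)")
prices = [float(price_pattern.search(text).group(1))
          for cls, text in parser.cells if cls == "price"]

rows = list(zip(symbols, prices))

# Store the result as CSV (written to memory here; a file works the same way).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["symbol", "price"])
writer.writerows(rows)
print(buffer.getvalue())
```

With a real page, the `HTML` string would come from a download step (for example via urllib or Requests), and BeautifulSoup would replace the hand-written parser class.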

Sunday, July 16, 2017

Craigslist Ads

Financial Modeling Tutor $25/hr (Midtown)

I am a former investment fund analyst and experienced in investment banking. 

Offering lessons on Excel, especially building models for companies.

Very affordable rate of $25, and you will be given all the skills needed to land a job at a hedge fund, investment bank, or private equity firm.

Valuation methods for companies, putting together models, write-ups, and presentations.

This is a very limited service, offered temporarily this month while I am on vacation and interested in sharing what I have learned.

Take advantage while you can. Hours are limited; availability also at 6 pm, just after your office hours.

Feel free to contact me if you are interested in learning how to be a finance/Excel expert! Thank you!

Google Sheets and Apps Script (JavaScript) tutor $25/hr (Manhattan)

I am a tutor for Google Apps Script (JavaScript), used for automation in Google Sheets. If your company uses Google Sheets, learning automation will help you progress. It is also important for business students.
Charge $25/hour.
Location: All in NYC area.
Please get in touch


GRE / GMAT Quant / Quantitative/ Math Tutor $25/hr

GRE / GMAT Quant Tutor
Experience in tutoring, videos online and good references.
Price $25 per hour
5 years of exp in quantitative GRE GMAT tutoring
From the author of FreeGREGMATClass dot com
Check out the youtube videos contributed by students and tutors for FGGC
First session of 30 minutes is free.   

Quantitative Math Excel VBA $25/hr (New York)

Tutoring Data Science has been my hobby and recreational activity. Many tutoring projects are volunteering and networking oriented (getting new insight). I do it so that I can revise what I do at school and at work. I am an Electrical Engineering graduate, GARP FRM certified, PG Dip in Fin analysis and Risk Management, cleared CFA L1, International MBA (15-16).

I have extensive teaching experience with students of various profiles and backgrounds where I have learned and enhanced my skills and also helped learners. I am also an extremely friendly person. I have taken many online classes as tutor on qcfinance.in

QcFinance.in has tutors available for meetups (mix of onsite at home & online through Wiziq/Skype/Adobe-connect & custom videos support emailed to you for your doubts on holidays)

Subject areas taught include: Quantitative Methods for GRE / SAT / GMAT, CFA L1, FRM L1, MATLAB/R/SPSS for Quant, Excel-VBA Programming. Quantitative and analytics Excel programming & VBA programming.

Sub topics:

Basics of Excel: Vlookups, Hlookups, Index Match, Dependents, Data tables (1 way and 2 ways), Pivots, Charting, Filters, SQL integration, VBA coding, Address and indirect, Offset, Array functions, etc.

Applications: Regression, Histograms, Monte carlo simulation, rank correlation, dashboards, more.
Automation using VBA: Loops (for, do while, case), Recorder, Arrays and Matrices, if else, indexing, etc.

Website: www.qcfinance.in

Playlist of sample quant videos: https://www.youtube.com/playlist?list=PL_-KSXJS5pxOiLjAoe5uAHPAsv-UhIM7i

Keywords: Quant Trainer, Tutor, Trainer, Teacher, Home tuition, GRE, GMAT, Quant, Programming, CFA Level 1, FRM Part 1, Mathematics Tutor, Maths, Onsite.
Keywords: Trainer, Tutor, Home tuition, Excel, VBA, Onsite, trainer, help assignment, video solutions, In Person, 1 on 1, Home Tutor. 

Office Automation on Excel VBA Python SQL R (new york)

We provide office automation services using Excel VBA, Python, SQL and R.

Automation is the next big revolution. Legacy methods, if not replaced by automation, will reduce the productivity of the firm, which might even lead to extinction.

Our services can remove a lot of manual work and make use of many Excel analytics features. Our clients have reduced work by up to 50-70%, which helped them focus on their product and other value addition to their core business.

Get more hours from your employees and more robust analytical framework!

Please contact me for more details about various processes that we can automate.    

Statistics, Data Science, Machine Learning, Statistical Computing, R   

Tutoring for statistics, machine learning, and data science. The focus includes statistical theory as well as its application, building models. That includes the following: 
•Theory Courses: Probability, Statistical Inference, Bayesian Statistics, Decision Theory, Point Estimation, High-Dimensional Inference, Time Series, and other MS/Ph.D. courses

•Machine Learning: Ridge Regression, LASSO, Basis Pursuit, Supervised/Unsupervised Learning, Neural Network, Statistical Learning Theory 

•Social Science: Causal Inference, Hierarchical Model, Multiple Imputation, Matching

•Statistical computing: R programming, Matlab, STATA, Python, Java, C

Monday, July 3, 2017

Analytics course New York

FINAL PROJECT
For the Analytics final project, you will collect, clean and
analyze a data set to solve a real world problem. From this
data, you will segment the data set and perform analysis.
Following your analysis, you will create both a dashboard and
In order for your project to be considered a success, you will
complete the following steps
‣ Identify a problem
‣ Obtain the data
‣ Understand the data
‣ Prepare, clean and format the data
‣ Analyze the data
‣ Create a dashboard to display insights both numerically and
‣ Present high level insights and the resulting actions to key
As you complete elements of your final project, you will be
required to present materials and receive feedback from your
instructional team and classmates as well as industry experts.
Our instructors are on hand to validate the feasibility and
manage the scope of your project.
Data Analytics

UNIT 1: DATA IN EXCEL
‣ The Value of Data Lesson 1
‣ Prepare Data in Excel Lesson 2
‣ Clean Data in Excel Lesson 3
‣ Dynamic Data Referencing Lesson 4
‣ Dynamic Data Aggregation Lesson 5
‣ Conditional Formatting and Aggregation Lesson 6
‣ The Value of Databases Lesson 7
‣ Query Large Databases Lesson 8
‣ Data Aggregation in SQL Lesson 9
‣ More Data Aggregation in SQL Lesson 10
‣ Efficient and Dynamic Queries Lesson 11
‣ Present Analysis Results Lesson 12
‣ Statistics to Validate Analysis Lesson 13
‣ Predictive Analysis Lesson 14
‣ Dashboard Design Lesson 15
‣ Track Metrics with Dashboards Lesson 16
‣ Effective Presentations with Data Lesson 17
‣ Flexible Session Lesson 18
‣ Flexible Session Lesson 19
‣ Final Project Presentation Lesson 20

‣ Explain the value of data.
‣ Describe the analytics workflow
‣ Use mean, median, mode to describe data and find outliers
‣ Describe best practices in data cleaning and collection to
ensure the best results from data analysis
‣ Use complex nested logical functions [IF, OR, and AND] to
further manipulate data sets
‣ Manipulate data formats to gain insights on how to analyze
‣ Clean large, messy datasets by removing duplicate rows
and performing text manipulations
‣ Transform and rearrange columns and rows to structure
data for analysis
‣ Manipulate data formats to gain insights on how to analyze
‣ Use data functions [VLOOKUP and HLOOKUP] to
manipulate data sets
‣ Use data functions [INDEX and MATCH] to look up values
in other tables
‣ Reconcile data values by joining and matching
‣ Summarize data using the pivot tables
‣ Use Excel aggregation commands [‘Min’, ‘Max’, ‘Sum’,
‘Average’, ‘Count’, ‘Frequency’ to accomplish “count
distinct”] and their conditional variants [‘COUNTIF’,
‘COUNTBLANK’] to summarize data sets
‣ Derive insights from data by highlight cells based on
‣ Describe color theory and how it applies to data visualization
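
The objective above about using mean, median and mode to describe data and find outliers can be sketched in a few lines of Python (the course itself works in Excel). The data values are made up, and the 1.5 × IQR rule for flagging outliers is an assumed convention, not something the syllabus specifies.

```python
import statistics

# Made-up data; 95 is a deliberately planted outlier.
values = [12, 15, 14, 15, 13, 16, 15, 14, 95]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# 1.5 * IQR rule (an assumed convention; other outlier rules exist).
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(f"mean={mean:.2f} median={median} mode={mode} outliers={outliers}")
```

A mean that sits far from the median, as it does here, is often the first hint that an extreme value is pulling the average.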
Data Analytics
Units Continued

‣ Use database schema to design appropriate queries
‣ Explain differences between relational databases (tabular
data storage) and document-based databases(key-value
‣ Collect data using standard SQL commands [Select, From,
Create, Update, Delete, Truncate, Drop]
‣ Use advanced SQL commands [WHERE, GROUP BY, HAVING,
ORDER BY, LIMIT] to filter data
‣ Use joins to create relationships between tables to obtain
‣ Use SQL boolean operators [AND and OR] and SQL
conditional operators [=,!=,>,<,IN and BETWEEN] to
obtain filtered data
‣ Create relationships between tables and data points
including has_many and many_to_many with join tables
using Joins [‘full’, and ‘union’]
‣ Use SQL conditional operators [=,!=,>,<,IN and
BETWEEN] and Null functions [‘is Null’, ‘is not Null’ and
‘NVL’] to create boolean statements
‣ Use SQL mathematical functions [ABS, SIGN, MOD,
FLOOR, CEILING, ROUND, SQRT] to clean data
‣ Use aggregation commands [‘Min’, ‘Max’, ‘Sum’, ‘Average’,
‘Count’, ‘Count Distinct’] to summarize data sets
‣ Use aggregation methods to determine trends from data
‣ Use CASE statements to structure data and create new
‣ Use "WITH AS (" to combine subqueries into one query
‣ Present analysis results and describe stakeholder
implications and insights
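
The SQL objectives above (filtering, grouping, CASE, and "WITH AS") can be tried out with a small sketch that runs against an in-memory SQLite database from Python. SQLite is used here only because it ships with Python; the `orders` table and its rows are invented for illustration, and other database engines may differ in details such as LIMIT syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ann", "east", 120.0),
        ("ann", "east", 80.0),
        ("bob", "west", 40.0),
        ("cara", "east", 300.0),
        ("dave", "west", 15.0),
        ("dave", "west", 10.0),
    ],
)

# WHERE filters rows, GROUP BY aggregates, HAVING filters groups,
# ORDER BY sorts, LIMIT truncates the result.
top_customers = conn.execute(
    """
    SELECT customer, SUM(amount) AS total, COUNT(*) AS n_orders
    FROM orders
    WHERE amount > 10
    GROUP BY customer
    HAVING SUM(amount) >= 100
    ORDER BY total DESC
    LIMIT 2
    """
).fetchall()
print(top_customers)

# WITH ... AS names a subquery so the outer query stays readable;
# CASE turns a numeric average into a label.
region_summary = conn.execute(
    """
    WITH customer_totals AS (
        SELECT customer, region, SUM(amount) AS total
        FROM orders
        GROUP BY customer, region
    )
    SELECT region,
           COUNT(*) AS customers,
           CASE WHEN AVG(total) >= 100 THEN 'high' ELSE 'low' END AS tier
    FROM customer_totals
    GROUP BY region
    ORDER BY region
    """
).fetchall()
print(region_summary)
conn.close()
```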

‣ Provide appropriate context of dataset
‣ Appropriately describe analysis techniques
‣ Present and describe stakeholder implications and insights
‣ Describe the value of descriptive and summary statistics in
understanding a dataset
‣ Create basic statistical measures to better understand the
range, average, and variance within a dataset
‣ Present the most salient statistics in order to provide
context to your audience
‣ Explain the importance of segmentation
‣ Describe the value of inferential statistics and predictive
‣ Review linear regression and Ordinary Least Squares (OLS)
‣ Use sample data to make predictions about a larger
‣ Use scatter plots and bar graphs to visualize data
‣ Apply the best practices to build a dashboard
‣ Demonstrate good visual design without overloading their
dashboard with complexity
‣ Use bubble graphs to visualize data
‣ Apply the best practices to build a dashboard
‣ Contextualize data analysis by creating Tableau dashboards
[includes charts + conditional formatting] with supporting
information specific to the dataset
‣ Display geocoded information in Tableau
‣ Provide real-world context for basis of analysis
‣ Provide localized context for implications of findings
‣ Deliver short, effective presentations
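
The linear regression and Ordinary Least Squares objective above can be made concrete with a short sketch in plain Python. It fits a single-variable line using the closed-form OLS formulas; the function name `ols_fit` and the spend/units figures are made up for illustration.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample: advertising spend (x) versus units sold (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = ols_fit(spend, units)
predicted = slope * 6.0 + intercept  # prediction for an unseen x value
print(f"slope={slope:.3f} intercept={intercept:.3f} prediction at x=6: {predicted:.2f}")
```

Using the fitted line to predict at x = 6 is the "use sample data to make predictions" step in miniature; how far such extrapolation can be trusted depends on the data.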

‣ Focus on a topic selected by the instructor/class in order to
provide deeper insight into a specific area of data analysis
‣ Focus on a topic selected by the instructor/class in order to
provide deeper insight into a specific area of data analysis
‣ Present final project presentation to class

Tuesday, June 27, 2017

SQL:Fundamentals of Querying Course New York Manhattan

SQL:Fundamentals of Querying Course New York Manhattan nyc

Duration: 1 day(s)

Executing a Simple Query
Connect to the SQL Database
Query a Database
Save a Query
Modify a Query
Execute a Saved Query

Performing a Conditional Search
Search Using a Simple Condition
Compare Column Values
Search Using Multiple Conditions
Search for a Range of Values and Null Values
Retrieve Data Based on Patterns

Working with Functions
Perform Date Calculations
Calculate Data Using Aggregate Functions
Manipulate String Values

Organizing Data
Sort Data
Rank Data
Group Data
Filter Grouped Data
Summarize Grouped Data
Use PIVOT and UNPIVOT Operators

Retrieving Data from Tables
Combine Results of Two Queries
Compare the Results of Two Queries
Retrieve Data by Joining Tables
Check for Unmatched Records
Retrieve Information from a Single Table Using Joins

Presenting Query Results
Save the Query Result
Generate an XML Report

Appendix A: The OGCBooks Database

After completing this course, students will know how to:
Connect to the SQL Server database and execute a simple query.
Include a search condition in a simple query.
Use various functions to perform calculations on data.
Organize data obtained from a query before it is displayed on-screen.
Retrieve data from tables.
Format an output, save a result, and generate a report.
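
The join and comparison topics in the outline above can be sketched with an in-memory SQLite database driven from Python. The course targets SQL Server, so SQLite is only a stand-in, and the `titles` and `sales` tables are hypothetical; they are not the schema of the OGCBooks database mentioned in the appendix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE titles (title_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales  (sale_id INTEGER PRIMARY KEY, title_id INTEGER, qty INTEGER);
    INSERT INTO titles VALUES (1, 'Intro to SQL'), (2, 'Data Modeling'), (3, 'Query Tuning');
    INSERT INTO sales  VALUES (10, 1, 5), (11, 1, 2), (12, 3, 7);
    """
)

# Inner join: only titles that have at least one sale.
sold = conn.execute(
    """
    SELECT t.name, SUM(s.qty) AS total_qty
    FROM titles AS t
    JOIN sales AS s ON s.title_id = t.title_id
    GROUP BY t.name
    ORDER BY t.name
    """
).fetchall()

# Left join plus IS NULL: titles with no matching sales row.
unsold = conn.execute(
    """
    SELECT t.name
    FROM titles AS t
    LEFT JOIN sales AS s ON s.title_id = t.title_id
    WHERE s.sale_id IS NULL
    """
).fetchall()

# EXCEPT compares two result sets: ids in titles that never appear in sales.
diff = conn.execute(
    "SELECT title_id FROM titles EXCEPT SELECT title_id FROM sales"
).fetchall()

print(sold, unsold, diff)
conn.close()
```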

Sunday, June 25, 2017

Thursday, June 22, 2017

Excel VBA Class in Manhattan

Below is the list of classes I found in Manhattan.

Active - First
421 7th Ave, New York, NY 10001
Phone: 212-537-6125
Address: 421 7th Avenue, 4th Floor

Sam - not small but not that big also
1723 E 12th St, Brooklyn, NY 11229

Moribound 2nd:
545 8th Ave
Suite 1530
New York, NY 10018
Phone: 212-564-2351

Wednesday, June 21, 2017

Quant /Math Tutoring in NYC areas close to Ozone Queens

To be checked:
http://www.flexprep.org/ - Closest

http://www.parkslopetutoring.org/ - down in Brooklyn
http://www.ridgewoodtutors.com/contact/ - Closest
http://pinnacleprep.com/ - Close to Iskon
https://www.ontutoring.com/ -  Close


Tuesday, June 20, 2017

Data Science Programs in New York

Python for Data Science & Machine learning
Course by QcFinance.in

Skills that you will GAIN
  • Python Programming Language
  • Statistical Hypothesis Testing
  • IPython
  • Hypothesis-testing
  • NetworkX
  • Matplotlib
  • Numpy
  • Pandas
  • Scipy
  • Python Lambdas
  • Python Regular Expressions
Python Basics
An introduction to the basic concepts of Python. Learn how to use Python both interactively and through a script. Create your first variables and acquaint yourself with Python's basic data types.
Learn to store, access and manipulate data in lists: the first step towards efficiently working with huge amounts of data.
Functions and Packages
To leverage the code that brilliant Python developers have written, you'll learn about using functions, methods and packages. This will help you to reduce the amount of code you need to solve challenging problems!
NumPy is a Python package to efficiently do data science. Learn to work with the NumPy array, a faster and more powerful alternative to the list, and take your first steps in data exploration.
Course Syllabus
Section 1: Python Basics
Take your first steps in the world of Python. Discover the different data types and create your first variable.
Section 2: Python Lists
Get to know the first way to store many different data points under a single name. Create, subset and manipulate lists in all sorts of ways.
Section 3: Functions and Packages & Control flow and Pandas
Learn how to get the most out of other people's efforts by importing Python packages and calling functions.
Write conditional constructs to tweak the execution of your scripts and get to know the Pandas DataFrame: the key data structure for Data Science in Python.
Section 4: Numpy and Matplotlib
Write superfast code with Numerical Python, a package to efficiently store and do calculations with huge amounts of data.
Create different types of visualizations depending on the message you want to convey. Learn how to build complex and customized plots based on real data.
A collection of powerful, open-source tools needed to analyze data and to conduct data science. Specifically, you’ll learn how to use:
  • python
  • jupyter notebooks
  • pandas
  • numpy
  • matplotlib
  • git
  • and many other tools.
We'll cover the machine learning and data mining techniques real employers are looking for, including:
  • Regression analysis
  • K-Means Clustering
  • Principal Component Analysis
  • Train/Test and cross validation
  • Bayesian Methods
  • Decision Trees and Random Forests
  • Multivariate Regression
  • Multi-Level Models
  • Support Vector Machines
  • Reinforcement Learning
  • Collaborative Filtering
  • K-Nearest Neighbor
  • Bias/Variance Tradeoff
  • Ensemble Learning
  • Term Frequency / Inverse Document Frequency
  • Experimental Design and A/B Tests
Statistics and Probability Refresher, and Python
  • Bayes' Theorem
  • Predictive Models
  • Linear Regression
  • Polynomial Regression
  • Multivariate Regression, and Predicting Car Prices
  • Multi-Level Models
  • Machine Learning with Python
  • Supervised vs. Unsupervised Learning, and Train/Test
  • Using Train/Test to Prevent Overfitting a Polynomial Regression
  • Bayesian Methods: Concepts
  • Implementing a Spam Classifier with Naive Bayes
  • K-Means Clustering
  • Clustering people based on income and age
  • Measuring Entropy
  • Install GraphViz
  • Decision Trees: Concepts
  • Decision Trees: Predicting Hiring Decisions
  • Ensemble Learning
  • Support Vector Machines (SVM) Overview
  • Using SVM to cluster people using scikit-learn
  • User-Based Collaborative Filtering
  • Item-Based Collaborative Filtering
  • Finding Movie Similarities
  • Improving the Results of Movie Similarities
  • Making Movie Recommendations to People
  • Improve the recommender's results
  • More Data Mining and Machine Learning Techniques
  • K-Nearest-Neighbors: Concepts
  • Using KNN to predict a rating for a movie
  • Dimensionality Reduction; Principal Component Analysis
  • PCA Example with the Iris data set
  • Data Warehousing Overview: ETL and ELT
  • Reinforcement Learning
  • Dealing with Real-World Data
  • Bias/Variance Tradeoff
  • K-Fold Cross-Validation to avoid overfitting
  • Data Cleaning and Normalization
  • Cleaning web log data
  • Normalizing numerical data
  • Detecting outliers
  • Apache Spark: Machine Learning on Big Data
  • Installing Spark - Part
  • Spark Introduction
  • Spark and the Resilient Distributed Dataset (RDD)
  • Introducing MLLib
  • Decision Trees in Spark
  • K-Means Clustering in Spark
  • TF / IDF
  • Searching Wikipedia with Spark
  • Using the Spark 2.0 DataFrame API for MLLib
  • Experimental Design
  • A/B Testing Concepts
  • T-Tests and P-Values
  • Hands-on With T-Tests
  • Determining How Long to Run an Experiment
  • A/B Test Gotchas
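
To give a flavor of one topic in the list above, Term Frequency / Inverse Document Frequency, here is a minimal sketch in plain Python. The three-document corpus is made up, and the weighting used (term frequency times log(N / document frequency)) is one common variant; libraries such as Spark MLlib apply their own smoothing, so their numbers will differ.

```python
import math
from collections import Counter

# Tiny made-up corpus; each document is a list of lowercase tokens.
docs = [
    "spark makes big data simple".split(),
    "python makes data science simple".split(),
    "spark and python work together".split(),
]


def tf_idf(corpus):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    n_docs = len(corpus)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in corpus for term in set(doc))
    scores = []
    for doc in corpus:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


scores = tf_idf(docs)
for i, doc_scores in enumerate(scores):
    top_term = max(doc_scores, key=doc_scores.get)
    print(i, top_term, round(doc_scores[top_term], 3))
```

Terms that occur in only one document receive the highest weight, which is why TF-IDF is a common first step for search and document similarity.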
Please email info@qcfinance.in for more information.

Some links from online search:

www.skilledup [dot] com/articles/list-data-science-bootcamps


Some general Videos That are suggested: