To gain knowledge of the basic concepts of data analysis
To acquire skills in data preparation and preprocessing steps
To develop the mathematical skills used in statistics
To learn the tools and packages in Python for data science
To gain an understanding of classification and regression models
To acquire knowledge in data interpretation and visualization techniques
Need for data science – benefits and uses – facets of data – data science process – setting the research goal – retrieving data – cleansing, integrating, and transforming data – exploratory data analysis – building the models – presenting findings and building applications
Frequency distributions – outliers – relative frequency distributions – cumulative frequency distributions – frequency distributions for nominal data – interpreting distributions – graphs – averages – mode – median – mean – averages for qualitative and ranked data – describing variability – range – variance – standard deviation – degrees of freedom – interquartile range – variability for qualitative and ranked data
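As a small illustration of the descriptive statistics listed in this unit, the sketch below builds frequency, relative frequency, and cumulative frequency distributions and then computes the common averages and measures of variability. The `scores` data set is invented for the example, and only Python's standard library is assumed.

```python
import statistics
from collections import Counter

# Hypothetical data set, made up for this illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

# Frequency distribution: how often each value occurs.
freq = Counter(scores)

# Relative frequency: each frequency as a proportion of all observations.
rel_freq = {value: count / n for value, count in sorted(freq.items())}

# Cumulative frequency: running total of frequencies up to each value.
cum_freq = {}
running_total = 0
for value, count in sorted(freq.items()):
    running_total += count
    cum_freq[value] = running_total

# Averages.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variability.
data_range = max(scores) - min(scores)
pop_variance = statistics.pvariance(scores)   # divides by n
sample_variance = statistics.variance(scores) # divides by n - 1 (degrees of freedom)
pop_std_dev = statistics.pstdev(scores)

# Interquartile range from the quartile cut points (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

print("relative frequencies:", rel_freq)
print("cumulative frequencies:", cum_freq)
print("mean, median, mode:", mean, median, mode)
print("range, variance, std dev, IQR:", data_range, pop_variance, pop_std_dev, iqr)
```

Note that the sample variance divides by n - 1 rather than n, which is where degrees of freedom enter the calculation.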
Basics of NumPy arrays – aggregations – computations on arrays – comparisons, masks, boolean logic – fancy indexing – structured arrays – data manipulation with Pandas – data indexing and selection – operating on data – missing data – hierarchical indexing – combining datasets – aggregation and grouping – pivot tables
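A few of the NumPy and Pandas topics in this unit can be sketched in a short example: aggregation, boolean masks, and fancy indexing on an array, followed by missing-data handling, grouping, and a pivot table on a DataFrame. The temperature and rainfall values, column names, and city labels are all invented for the illustration, and the sketch assumes `numpy` and `pandas` are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical daily temperatures, made up for this illustration.
temps = np.array([21.5, 19.0, 25.2, 30.1, 18.4, 27.3])

# Aggregation over the whole array.
mean_temp = temps.mean()

# Comparison produces a boolean mask; the mask then selects elements.
warm_mask = temps > 25.0
warm_days = temps[warm_mask]

# Fancy indexing: select several positions at once with a list of indices.
picked = temps[[0, 2, 5]]

# Build a DataFrame with some missing rainfall values (NaN).
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B", "A"],
    "season": ["summer", "winter", "summer", "summer", "winter", "summer"],
    "temp": temps,
    "rain": [1.2, np.nan, 0.0, 3.4, np.nan, 0.8],
})

# Missing data: replace NaN with 0.0 (one possible policy, assumed here).
df["rain"] = df["rain"].fillna(0.0)

# Aggregation and grouping: mean temperature per city.
by_city = df.groupby("city")["temp"].mean()

# Pivot table: mean temperature by city (rows) and season (columns).
pivot = df.pivot_table(values="temp", index="city", columns="season", aggfunc="mean")

print(warm_days)
print(by_city)
print(pivot)
```

Filling missing values with zero is only one choice; dropping the rows or interpolating may suit other data sets better.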
Normal distributions – z scores – normal curve problems – finding proportions – finding scores – more about z scores – correlation – scatter plots – correlation coefficient for quantitative data – computational formula for correlation coefficient – regression – regression line – least squares regression line – standard error of estimate – interpretation of r² – multiple regression equations – regression toward the mean
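The z score, correlation coefficient, least squares regression line, and standard error of estimate from this unit can be worked through on a tiny data set. The sketch below uses invented `x` and `y` values and plain Python, computing each quantity from sums of squares and products. It assumes the population standard deviation for z scores and n - 2 degrees of freedom for the standard error of estimate, which is one common textbook convention.

```python
import math

# Hypothetical paired observations, made up for this illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and the sum of products of deviations.
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# z scores for x, using the population standard deviation (assumption).
sd_x = math.sqrt(ss_x / n)
z_x = [(xi - mean_x) / sd_x for xi in x]

# Pearson correlation coefficient.
r = sp_xy / math.sqrt(ss_x * ss_y)

# Least squares regression line: predicted y = slope * x + intercept.
slope = sp_xy / ss_x
intercept = mean_y - slope * mean_x

# r squared: proportion of variance in y predictable from x.
r_squared = r ** 2

# Standard error of estimate with n - 2 degrees of freedom.
std_error_estimate = math.sqrt(ss_y * (1 - r_squared) / (n - 2))

print("r:", round(r, 4))
print("regression line: y' =", slope, "* x +", round(intercept, 4))
print("r squared:", round(r_squared, 4))
print("standard error of estimate:", round(std_error_estimate, 4))
```

Because z scores are standardized, their mean is zero, which gives a quick sanity check on the calculation.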
Visualization with matplotlib – line plots – scatter plots – visualizing errors – density and contour plots – histograms, binnings, and density – three-dimensional plotting – geographic data – data analysis using statsmodels and seaborn – graph plotting using Plotly – interactive data visualization using Bokeh
At the end of the course, the students should be able to:
Apply the skills of data inspection and cleansing.
Determine relationships and dependencies in data using statistics.
Handle data using the primary Python tools for data science.
Represent useful information using mathematical skills.
Apply this knowledge to describe and visualize data using tools.