Statistics in Data Science
What is Statistics?
- The study of the collection, organization, analysis, interpretation, and presentation of data
- Problem Statement -> Data Analysis -> Informed Business Decisions (Problems Solved)
- Mean, median, and mode are examples of statistical measures
- Statistical Principles (assumptions that a method relies on):
  - Many parametric methods assume the data are approximately normally distributed
  - Linear regression assumes the relationship between the variables is linear
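
As a small illustration of the measures mentioned above, the following sketch computes the mean, median, and mode with Python's standard `statistics` module. The data values are made up for the example and do not come from any real dataset.

```python
import statistics

# Hypothetical values, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # arithmetic average: sum / count
median_value = statistics.median(values)  # middle value (average of the two middle values here)
mode_value = statistics.mode(values)      # most frequently occurring value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

The mean is pulled upward by the larger values (7 and 10), while the median and mode stay closer to where most of the values sit, which is one reason to look at more than one measure.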
- Categories of Statistics
  - Descriptive Statistics
    - used when we have complete data for the given population; it summarizes that data directly
  - Inferential Statistics
    - used when there is incomplete data for the given population (e.g., exit polls)
    - used when it is not feasible to examine every member of the population or analyze all of its data
    - we study a random sample and use it to describe or make inferences about the population
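
The difference between the two categories can be sketched in code. This is a minimal example under stated assumptions: the population is simulated (10,000 values from a normal distribution with made-up parameters), the sample size of 200 is arbitrary, and the interval uses the common normal approximation with z = 1.96.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated values (not real data).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Descriptive: the full population is available, so summarize it directly.
population_mean = statistics.mean(population)

# Inferential: look only at a random sample and estimate the population mean.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% interval for the population mean (normal approximation).
margin = 1.96 * standard_error
interval = (sample_mean - margin, sample_mean + margin)

print(f"population mean (descriptive): {population_mean:.2f}")
print(f"sample mean (inferential estimate): {sample_mean:.2f}")
print(f"approximate 95% interval: {interval[0]:.2f} to {interval[1]:.2f}")
```

In practice the population mean is unknown, which is exactly why the sample estimate and its interval are needed; it is computed here only so the two can be compared.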
- Statistical Analysis Considerations
  - make sure the purpose is clear and well-defined
  - document the questions in advance
  - define the population of interest (based on the purpose)
  - determine the sample (based on the purpose of the study)
  - the sample must be random so that it represents the characteristics of the population
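
To see why the sample needs to be random, the sketch below compares a simple random sample with a non-random "convenience" sample taken from the start of an ordered list. The population (the integers 1 to 1000) and the sample size of 100 are hypothetical choices made only to keep the effect easy to see.

```python
import random
import statistics

# Hypothetical population: the integers 1..1000, stored in sorted order.
population = list(range(1, 1001))
true_mean = statistics.mean(population)

# Non-random "convenience" sample: just the first 100 records.
convenience_sample = population[:100]
convenience_mean = statistics.mean(convenience_sample)

# Simple random sample: every member has the same chance of being selected.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 100)
random_mean = statistics.mean(random_sample)

print("true mean:", true_mean)
print("convenience sample mean:", convenience_mean)
print("random sample mean:", random_mean)
```

The convenience sample only ever sees the smallest values, so its mean is far from the population mean, while the random sample's mean typically lands much closer to it.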