Summary
Examining and summarizing large data sets to identify trends and help organizations make strategic decisions.
Jots
Related to and often tagged as Data Science. Data analysis is more about using past data to inform current decisions, while data science is trying to predict future outcomes based on available data.
Relevant Categorizations
Should probably break these out into their own pages eventually.
Kinds of data: categorical or numerical, discrete or continuous
Kind | Examples |
---|---|
categorical | gender, location |
numerical | number of customers, active users |
discrete | number of applicants to a job |
continuous | height, temperature (any value within a range) |
Data characteristics:
Characteristic | Explanation |
---|---|
cross-sectional | snapshot of multiple subjects at a single point in time |
time series | a single subject measured at multiple points in time, e.g. test scores or wages over time |
panel data | multiple subjects and multiple points in time |
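A minimal sketch of how the three shapes relate, assuming a made-up wage data set (the subjects, years, and wages are hypothetical). Panel data holds both dimensions; a cross-section and a time series are slices of it.

```python
# Hypothetical panel data: each record is one subject observed in one year.
panel = [
    {"subject": "A", "year": 2022, "wage": 50_000},
    {"subject": "A", "year": 2023, "wage": 52_000},
    {"subject": "B", "year": 2022, "wage": 61_000},
    {"subject": "B", "year": 2023, "wage": 63_500},
]

# Cross-sectional slice: every subject at a single point in time.
cross_section = {row["subject"]: row["wage"] for row in panel if row["year"] == 2023}

# Time-series slice: a single subject across all points in time.
time_series = [(row["year"], row["wage"]) for row in panel if row["subject"] == "A"]

print(cross_section)  # {'A': 52000, 'B': 63500}
print(time_series)    # [(2022, 50000), (2023, 52000)]
```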
- dispersion
- how spread out the data is; high dispersion means a large range or variance
- confidence interval
- a range of values likely to include the true population value at a stated confidence level (e.g. 95%)
- sampling
- selection of a subset of individuals from within a statistical population, used to estimate characteristics of the whole population
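A rough sketch tying the three terms together, using only Python's standard library. The population of test scores is simulated and hypothetical, and the interval uses a normal approximation, which is one common choice rather than the only one.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated test scores.
population = [random.gauss(70, 10) for _ in range(10_000)]

# Sampling: pick a subset to estimate characteristics of the whole population.
sample = random.sample(population, 200)

# Dispersion: how spread out the sample is.
value_range = max(sample) - min(sample)
std_dev = statistics.stdev(sample)

# 95% confidence interval for the population mean (normal approximation).
mean = statistics.mean(sample)
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
margin = z * std_dev / len(sample) ** 0.5
ci_low, ci_high = mean - margin, mean + margin

print(f"range={value_range:.1f}, stdev={std_dev:.1f}")
print(f"95% CI for the mean: ({ci_low:.1f}, {ci_high:.1f})")
```

A larger sample shrinks `margin`, so the interval tightens around the sample mean.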
Basic Data Analysis #process
An iterative loop:
- Define your problem
- Disassemble the problems and data into smaller pieces
- Evaluate the problems and data to draw conclusions about what you’ve learned
- Decide on a course of action that solves the problem
Main Concepts for Data Analysis Workflow
- Data Collection
- systematically gathering and measuring information about specific variables for later processing
- Data Cleanup
- inspecting and processing raw data to improve its quality, integrity, and relevance to the problem at hand.
- Data Exploration
- identifying patterns and anomalies in data while examining its structure, and testing hypotheses to confirm or clarify your understanding.
- Statistical Analysis
- transforming data sets into information that can be used for understanding and decision-making.
- Machine Learning
- a subset of Artificial Intelligence that allows an application to discern patterns and automatically improve its analysis of extremely large datasets over time.
Types of Data Analytics
- Descriptive Analytics
- transforming data into more easily understood forms through summarization, organization, and simplification
- Diagnostic Analytics
- examines historical data and connections within its chronology to understand the root cause of observed changes.
- Predictive Analytics
- identifies patterns in existing data which may be used to forecast future outcomes and trends.
- Prescriptive Analytics
- builds on Predictive Analytics to recommend actions based on known parameters, anticipating future outcomes and explaining why those outcomes will take place.
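A toy sketch of how descriptive, predictive, and prescriptive analytics build on each other. The monthly user counts and the `capacity` threshold are invented for illustration, and a straight-line fit stands in for whatever model a real project would use.

```python
import statistics

# Hypothetical monthly active users for months 1..6.
months = [1, 2, 3, 4, 5, 6]
users = [100, 120, 140, 160, 180, 200]

# Descriptive: summarize what already happened.
summary = {"mean": statistics.mean(users), "min": min(users), "max": max(users)}

# Predictive: fit a least-squares line and extrapolate to month 7.
mean_x, mean_y = statistics.mean(months), statistics.mean(users)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, users)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x
forecast = slope * 7 + intercept

# Prescriptive: turn the forecast into a recommended action.
capacity = 210  # hypothetical limit on users the current setup can serve
recommendation = "add capacity" if forecast > capacity else "no change"

print(summary)
print(forecast)        # 220.0
print(recommendation)  # add capacity
```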
Specializations
- 4 Data Analyst Career Paths: Your Guide to Leveling Up | Coursera
- What Does An Operations Analyst Do | Thinkful
- business analyst
- streamline IT processes, organizational structures, or staff development
- financial analyst
- guide investment opportunities, identify revenue opportunities, and mitigate financial risk
- health care analyst
- use data from health records, cost reports, and patient surveys to help providers improve their quality of care
- market research analyst
- analyze market trends to help determine product and service offerings, price points, and target customers
- operations analyst
- collaborative role working with teams to identify and solve technical, structural, and procedural issues in order to optimize org performance
- systems analyst
- use cost-benefit analysis to help match technological solutions to company needs
Terms
- business intelligence (BI)
- infrastructure to support collection and analysis of business ops data
- business analytics
- turning an org’s raw data into useful information to identify trends, predict outcomes, etc.
- data warehouse
- central repository of data integrated from one or more sources
- ETL
- Extract, Transform, Load; pull data from multiple sources, munge it into a single consistent set, then store it for further processing by multiple consumers
- ELT
- Extract, Load, Transform; grab it and store it, letting consumers transform it as needed when they grab it
- data blending
- munging data from multiple sources into a single set or warehouse for a specific use case
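A small sketch of the ETL vs. ELT difference, with plain lists standing in for a warehouse. The sources, field names, and helper functions are all hypothetical; the only point is where the transform step happens.

```python
# Hypothetical raw records from two sources with inconsistent formatting.
source_a = [{"name": " Ada ", "spend": "120.50"}]
source_b = [{"name": "GRACE", "spend": "80"}]


def extract():
    """Pull raw rows from every source."""
    return source_a + source_b


def transform(row):
    """Normalize names and convert spend to a number."""
    return {"name": row["name"].strip().title(), "spend": float(row["spend"])}


# ETL: transform before loading, so the warehouse only holds clean rows.
etl_warehouse = [transform(row) for row in extract()]

# ELT: load raw rows as-is; each consumer transforms on read.
elt_warehouse = list(extract())
elt_view = [transform(row) for row in elt_warehouse]

print(etl_warehouse)
print(elt_warehouse)
```

Both routes end with the same clean rows; ELT just keeps the raw copy around so different consumers can apply different transforms later.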
Tools
Related
- BI vs. business analytics: What’s the difference? | Tableau
- What is Data Analytics? A 30,000-Foot Intro to Key Data Analysis Concepts
- AI and Data Scientist Roadmap
- Data Analyst Roadmap
Added to vault 2024-03-15. Updated on 2024-05-06