
Commit 88ba92d

Initial commit
0 parents  commit 88ba92d

File tree

13 files changed, +1562 -0 lines changed


.gitignore

Lines changed: 72 additions & 0 deletions
# Yarr! This be what we don't want in our treasure chest (git repository)

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual environments
venv/
env/
ENV/
env.bak/
venv.bak/

# IDE files
.vscode/
.idea/
*.swp
*.swo
*~

# OS files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Data files (these be large treasures that shouldn't go in git)
data/*.csv
data/*.json
data/*.xlsx

# Test coverage
.coverage
htmlcov/
.pytest_cache/

# Logs
*.log
logs/

# Environment variables
.env
.env.local
.env.production

# Temporary files
tmp/
temp/

README.md

Lines changed: 217 additions & 0 deletions
# Developer Insights Analytics Dashboard 🏴‍☠️📊

Yarr! Welcome to the Developer Insights Analytics Dashboard, a comprehensive data analysis treasure chest that helps data analysts explore and visualize developer survey data! This be a flexible, full-stack application built with modern data analysis practices in mind.

## 🗺️ Project Structure

```
python-fullstack/
├── .gitignore
├── README.md
├── requirements.txt
├── data/
│   └── kaggle_so_2023/          # Stack Overflow 2023 survey data
│       ├── survey_results_public.csv
│       ├── survey_results_schema.csv
│       └── ...
├── app/
│   ├── __init__.py
│   ├── main.py                  # Main FastAPI application
│   ├── data_config.py           # Data source configuration & analysis
│   └── templates/
│       └── index.html           # Analytics dashboard frontend
└── tests/
    ├── __init__.py
    └── test_main.py             # Comprehensive test suite
```
## ⚓ Technology Stack

- **Backend:** Python 3.10+ with FastAPI
- **Data Analysis:** Pandas with flexible data source management
- **Web Server:** Uvicorn with auto-reload
- **Frontend:** HTML5, JavaScript (ES6+), Chart.js with interactive controls
- **API Design:** RESTful with Pydantic models and comprehensive error handling
- **Testing:** Pytest with full API coverage
## 🔍 Analytics Features

This application is designed specifically for **data analysts** who need:

### 📊 Flexible Data Analysis
- **Multiple Technology Categories**: Languages, Databases, Platforms, Web Frameworks
- **Configurable Results**: Choose the top 10, 15, 20, or 25 results
- **Real-time Analysis**: Interactive dashboard with instant results
- **Comparison Views**: "Have Worked With" vs. "Want to Work With" analysis

### 🔌 Extensible Data Sources
- **Modular Design**: Easy to add new data sources
- **Schema Validation**: Built-in data validation and error handling
- **Multiple Format Support**: CSV with automatic schema detection
- **Data Quality Insights**: Response counts and unique technology metrics
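The Stack Overflow survey stores multi-select answers (such as languages worked with) as semicolon-separated strings, so counting usage means splitting each response first. A minimal pandas sketch of that counting step, using a tiny hypothetical frame rather than the real CSV:

```python
import pandas as pd

# Hypothetical mini-frame mimicking the survey's semicolon-separated
# multi-select columns (e.g. LanguageHaveWorkedWith).
df = pd.DataFrame({
    "LanguageHaveWorkedWith": [
        "Python;JavaScript;SQL",
        "Python;Rust",
        None,                  # unanswered rows are dropped before counting
        "JavaScript;SQL",
    ]
})

def top_technologies(frame: pd.DataFrame, column: str, top_n: int = 10) -> pd.Series:
    """Split a multi-select column into entries and count each one."""
    answered = frame[column].dropna()
    counts = answered.str.split(";").explode().value_counts()
    return counts.head(top_n)

# Python, JavaScript, and SQL each appear twice; Rust once
print(top_technologies(df, "LanguageHaveWorkedWith"))
```

The same split/explode/count pattern scales to the full survey file, which is presumably what the analysis endpoints do under the hood.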
## 🏴‍☠️ Setup Instructions

### 1. Data Setup (Already Done!)

The Stack Overflow 2023 survey data is already available in the `data/kaggle_so_2023/` directory with:
- `survey_results_public.csv` - Main survey responses
- `survey_results_schema.csv` - Data schema and column descriptions
- Additional documentation files

### 2. Install Dependencies

Make sure ye have Python 3.10+ installed, then install the required packages:

```bash
# Activate the virtual environment (if ye haven't already)
source venv/bin/activate   # On macOS/Linux
# or
venv\Scripts\activate      # On Windows

# Install the treasure chest of dependencies
pip install -r requirements.txt
```

### 3. Run the Application

Start the FastAPI server like hoisting the main sail:

```bash
# Run the application with auto-reload
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### 4. Access the Analytics Dashboard

Once the server be running, open yer browser and navigate to:
- **Interactive Dashboard:** http://localhost:8000
- **API Documentation:** http://localhost:8000/docs (FastAPI auto-generated)
- **Data Sources API:** http://localhost:8000/api/data-sources
## 🧪 Running Tests

To run the comprehensive test suite:

```bash
# Run all tests with verbose output
pytest -v

# Run tests with coverage report
pytest --cov=app --cov-report=html

# Run a specific test
pytest tests/test_main.py::test_technology_analysis_endpoint -v
```
## 📊 API Endpoints

### GET `/api/data-sources`
- **Description:** Lists all available data sources and their analysis capabilities
- **Response:** Array of data source information with available columns

### GET `/api/analysis/technology-usage`
- **Description:** Flexible technology usage analysis with multiple parameters
- **Parameters:**
  - `source`: Data source name (default: `"stackoverflow_2023"`)
  - `column`: Technology category to analyze (default: `"LanguageHaveWorkedWith"`)
  - `top_n`: Number of results to return (1-50, default: 10)
- **Response:** Comprehensive analysis results with metadata

### GET `/api/schema/{source_name}`
- **Description:** Returns schema information for a data source
- **Response:** Data structure and column definitions

### GET `/api/languages/popular` (Legacy)
- **Description:** Backward-compatible endpoint for the original specification
- **Response:** Top 10 programming languages in legacy format

### GET `/`
- **Description:** Interactive analytics dashboard
- **Response:** Full-featured HTML dashboard with controls
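The technology-usage endpoint above is plain query-string GET, so it can be scripted from anywhere. A small sketch of building the request URL in Python, assuming the uvicorn server from the setup steps is running on port 8000; the live fetch is left commented so the snippet works offline:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumes the local server from the setup steps

def technology_usage_url(source: str = "stackoverflow_2023",
                         column: str = "LanguageHaveWorkedWith",
                         top_n: int = 10) -> str:
    """Build a query URL for GET /api/analysis/technology-usage."""
    if not 1 <= top_n <= 50:
        raise ValueError("top_n must be between 1 and 50")
    query = urlencode({"source": source, "column": column, "top_n": top_n})
    return f"{BASE_URL}/api/analysis/technology-usage?{query}"

print(technology_usage_url(column="DatabaseHaveWorkedWith", top_n=15))

# To actually fetch (server must be running):
#   import json, urllib.request
#   with urllib.request.urlopen(technology_usage_url()) as resp:
#       data = json.load(resp)
```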
## 🎯 Data Analyst Features

### 🔧 Interactive Analysis Controls
- **Data Source Selection**: Choose from available datasets
- **Technology Categories**: 8+ different analysis dimensions
  - Programming Languages (Used/Wanted)
  - Databases (Used/Wanted)
  - Platforms (Used/Wanted)
  - Web Frameworks (Used/Wanted)
- **Result Customization**: Adjustable result counts
- **Real-time Updates**: Instant analysis with loading indicators

### 📈 Rich Visualizations
- **Interactive Bar Charts**: Hover details with percentages
- **Color-coded Categories**: Professional color schemes
- **Responsive Design**: Works on all screen sizes
- **Export Ready**: High-quality charts suitable for presentations

### 📊 Analysis Metadata
- **Response Counts**: Total survey responses analyzed
- **Technology Coverage**: Number of unique technologies found
- **Data Quality**: Insights into data completeness
- **Source Attribution**: Clear data provenance
## 🚀 Future Enhancements for Data Analysts

This application be designed with extensibility in mind! Future versions could include:

### 📊 Advanced Analytics
- **Cross-tabulation Analysis**: Technology combinations and correlations
- **Trend Analysis**: Year-over-year comparisons when historical data is available
- **Demographic Breakdowns**: Analysis by experience level, company size, location
- **Salary Analysis**: Compensation trends by technology stack

### 🔄 Data Pipeline Features
- **Multiple Data Sources**: Support for different survey years and sources
- **Data Refresh Automation**: Scheduled data updates and processing
- **Data Quality Monitoring**: Automated validation and completeness checks
- **Custom Data Uploads**: Allow analysts to upload their own datasets

### 📈 Enhanced Visualizations
- **Multiple Chart Types**: Scatter plots, heatmaps, time series
- **Interactive Filtering**: Dynamic data exploration with multiple dimensions
- **Export Capabilities**: PDF reports, CSV exports, chart images
- **Dashboard Customization**: Save and share custom analysis configurations

### 🔒 Enterprise Features
- **User Authentication**: Multi-user support with role-based access
- **API Rate Limiting**: Production-ready API with proper throttling
- **Database Integration**: PostgreSQL/MongoDB for larger datasets
- **Caching Layer**: Redis for improved performance with large datasets
## 👥 For Data Analysts

This application follows data analysis best practices:

- **Reproducible Analysis**: All analysis parameters are configurable and documented
- **Data Validation**: Built-in checks for data quality and completeness
- **Error Handling**: Graceful handling of missing data and edge cases
- **Performance Optimization**: Efficient data processing for large datasets
- **API-First Design**: Easy integration with other analysis tools and notebooks
- **Comprehensive Testing**: Full test coverage ensures reliability
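As a small illustration of the API-first workflow, a JSON response from the analysis API can be dropped straight into a pandas DataFrame inside a notebook. The payload below is a hypothetical example of what the endpoint might return, not the app's documented schema:

```python
import pandas as pd

# Hypothetical JSON payload from the technology-usage endpoint;
# the exact response shape is an assumption, not taken from app/main.py.
response_json = {
    "column": "LanguageHaveWorkedWith",
    "total_responses": 1000,
    "results": [
        {"name": "JavaScript", "count": 620},
        {"name": "Python", "count": 540},
        {"name": "SQL", "count": 490},
    ],
}

# Flatten the results into a DataFrame and derive a percentage column,
# ready for further analysis or plotting in a notebook.
df = pd.DataFrame(response_json["results"])
df["percentage"] = 100 * df["count"] / response_json["total_responses"]
print(df.sort_values("count", ascending=False))
```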
## 🏴‍☠️ Development Notes

- **Modular Architecture**: Easy to extend with new data sources and analysis types
- **Clean Code Principles**: Well-documented, maintainable codebase
- **Type Safety**: Pydantic models for API contract enforcement
- **Async Support**: Built for high-performance concurrent requests
- **Docker Ready**: Easy containerization for deployment
- **All code be commented in proper pirate fashion, yarr!**
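The "Type Safety" note above refers to Pydantic enforcing the API contract at the boundary. A minimal sketch of what such a model might look like; the class and field names here are illustrative, not the app's actual schema:

```python
from pydantic import BaseModel, Field

# Hypothetical contract models for a technology-usage response.
class TechnologyCount(BaseModel):
    name: str
    count: int = Field(ge=0)  # negative counts are rejected at validation time

class TechnologyUsageResponse(BaseModel):
    source: str
    column: str
    total_responses: int
    results: list[TechnologyCount]

# Validation happens at construction time, so a malformed payload
# fails loudly instead of flowing silently into the frontend.
payload = {
    "source": "stackoverflow_2023",
    "column": "LanguageHaveWorkedWith",
    "total_responses": 1000,
    "results": [{"name": "Python", "count": 540}],
}
response = TechnologyUsageResponse(**payload)
print(response.results[0].name)
```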
## 📝 License

This treasure be open source - use it freely for yer data analysis adventures, but remember to give credit where it be due!

---

*Built with ❤️ and ⚓ by data analyst pirates who love clean code, robust analysis, and beautiful visualizations*

**Perfect for:** Data analysts, researchers, survey data exploration, technology trend analysis, and learning modern full-stack development with a focus on data science applications.

app/__init__.py

Lines changed: 1 addition & 0 deletions
# Yarr! This be the main app package, matey!

0 commit comments