Test Beam Data Pipeline
Project Overview
| Role: Developer & Analyst | Duration: 2021-Present | Data Volume: TB-scale per campaign |
The Challenge
Process and analyze TB-scale datasets from particle beam experiments with quick turnaround to guide ongoing data-taking decisions.
Solution
Developed an automated analysis pipeline featuring:
- Real-time monitoring: Data quality checks during beam time
- Parallel processing: Distributed computing on GRID infrastructure
- Automated reporting: Daily summaries and visualizations
- Version control: Reproducible analysis framework
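The real-time monitoring step boils down to per-run quality checks that flag anomalous data while the beam is still running. A minimal sketch of that idea, using Pandas from the stack below (the `flag_bad_runs` helper, column names, and thresholds are illustrative, not the pipeline's actual code):

```python
import pandas as pd

def flag_bad_runs(df, lo=0.8, hi=1.2):
    """Return run numbers whose mean normalized signal drifts outside [lo, hi].

    Thresholds are illustrative; a real check would calibrate them per detector.
    """
    means = df.groupby("run")["signal"].mean()
    return sorted(means[(means < lo) | (means > hi)].index)

# Synthetic example: run 2 drifts low and gets flagged.
df = pd.DataFrame({
    "run":    [1, 1, 2, 2, 3, 3],
    "signal": [1.0, 1.1, 0.5, 0.6, 1.0, 0.9],
})
print(flag_bad_runs(df))  # → [2]
```

In practice a check like this runs on each freshly written file, so shifters see a flagged run within minutes rather than after offline reprocessing.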

Technical Stack
Python · ROOT · Pandas · NumPy · Git · Bash · HTCondor · GRID Computing
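The parallel-processing step fans analysis jobs out over HTCondor, one job per run. A submit description along these lines is the usual pattern (executable name, paths, and resource requests here are illustrative placeholders, not the project's actual configuration):

```
# Illustrative HTCondor submit file: one job per run number listed in runs.txt
executable     = run_analysis.sh
arguments      = $(run_number)
output         = logs/analysis_$(run_number).out
error          = logs/analysis_$(run_number).err
log            = logs/analysis.log
request_cpus   = 1
request_memory = 2GB
queue run_number from runs.txt
```

The `queue ... from` form reads one line per job from a file, which keeps the submit description fixed while the run list grows during a campaign.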
Results
- Reduced analysis turnaround from weeks to days
- Reproducible results across different test beam campaigns
- Framework adopted by collaborating institutions
- Enabled quick feedback for detector optimization
Industry Relevance
Skills directly applicable to:
- Data Engineering: Building robust data pipelines
- DevOps: Automation and workflow orchestration
- Scientific Computing: Statistical analysis and visualization
- Distributed Systems: GRID/cluster computing experience