Isai Roberto Sotarriva Alvarez
Physics Ph.D. | Machine Learning Engineer | Expertise in Statistical Analysis & Data Science
I am a Physics Ph.D. candidate with expertise in machine learning, data science, and advanced statistical analysis. In Mexico, I worked with ALICE Mexico on machine learning and data analysis for event identification. I have since moved to Japan, where I work as part of the ATLAS collaboration. During my master's, I developed a computer vision algorithm for quality control of the new semiconductor detectors. For my Ph.D., I am working on a GNN/RNN method for removing the electron and jet backgrounds in the diphoton analysis. I am also currently a teaching assistant at Tokyo Institute of Technology. I am passionate about applying computational and analytical skills to solve complex problems, and I am seeking roles in machine learning engineering or postdoctoral research in physics.


Early Work: Simulation and Statistical Analysis
During my bachelor's degree, I worked on Monte Carlo simulations using Pythia and presented results at the ALICE Mexico conference.

Bachelor's Thesis: First Use of Machine Learning
For my bachelor's thesis, I used machine learning techniques within the ROOT TMVA framework. The work focused on feature ranking, feature decorrelation, covariance analysis, and systematic comparison of multiple classifiers, including linear discriminants, support vector machines, neural networks, and boosted decision trees. Model performance was evaluated using statistical significance, which was used to define the final working point.
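
The working-point choice described above can be illustrated with a minimal sketch. It is not the thesis code: it assumes the significance definition S/sqrt(S+B), a classifier score in [0, 1], and toy score distributions and yields, and the function name `best_working_point` is hypothetical.

```python
import numpy as np

def best_working_point(sig_scores, bkg_scores, n_sig, n_bkg, n_cuts=101):
    """Scan cuts on a classifier score and return (cut, significance)
    for the cut that maximizes S / sqrt(S + B).

    n_sig and n_bkg are the expected yields before any cut (assumed inputs).
    """
    best_cut, best_z = 0.0, -np.inf
    for cut in np.linspace(0.0, 1.0, n_cuts):
        s = n_sig * np.mean(sig_scores >= cut)  # expected signal passing the cut
        b = n_bkg * np.mean(bkg_scores >= cut)  # expected background passing the cut
        if s + b <= 0.0:
            continue  # nothing passes; significance is undefined
        z = s / np.sqrt(s + b)
        if z > best_z:
            best_cut, best_z = cut, z
    return best_cut, best_z

# Toy scores: signal peaks near 1, background near 0 (illustrative only).
rng = np.random.default_rng(0)
sig_scores = rng.beta(5, 2, size=10_000)
bkg_scores = rng.beta(2, 5, size=10_000)

cut, z = best_working_point(sig_scores, bkg_scores, n_sig=100.0, n_bkg=1000.0)
print(f"working point: score >= {cut:.2f}, significance = {z:.2f}")
```

Other significance definitions (for example S/sqrt(B) or a profile-likelihood estimate) would slot into the same scan by changing the line that computes `z`.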

Master's Work: Detector Quality Control and Image Processing
During my master's studies in Japan, I worked on quality control for semiconductor pixel detector modules. This involved image processing tasks such as high-performance image stitching and visual inspection of wire bonding. I focused on traditional computer vision approaches rather than machine learning in order to improve stability, interpretability, and controllability in a production environment.

ATLAS Qualification Task: Scientific Software Development
As part of my ATLAS qualification task, I worked on detector alignment using the Athena software framework. This required a detailed understanding of the reconstruction software and adherence to a strict development workflow, including code style requirements, detailed reviews, and iterative revisions before merge approval. This experience provided strong exposure to large, long-lived scientific codebases.

Ph.D. Research: Machine Learning Under Experimental Constraints
At the beginning of my Ph.D., I explored graph neural network approaches to improve the discrimination between diphoton signal events and background from misidentified electrons. This work highlighted practical limitations related to simulation accuracy, domain shift between Monte Carlo and data, and the constraints imposed by analysis approval procedures in large collaborations.

Current Ph.D. Work: Probabilistic Background Modeling
My current thesis work focuses on modeling the Z→ee background in diphoton analyses. The problem can be formulated as an optimal transport task for the invariant mass distribution, combined with a probabilistic model for event yields. Each particle misidentification is treated as an independent stochastic process, allowing both shape and normalization effects to be modeled consistently.
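
A toy sketch of the two ingredients named above, under stated assumptions and not the thesis implementation: in one dimension, the optimal transport map for a convex cost is the monotone quantile map T = F_target^{-1}(F_source(x)), and with independent misidentifications the expected yield factorizes into per-electron fake rates. All numbers, distributions, and function names here are hypothetical.

```python
import numpy as np

def quantile_transport(source, target, x):
    """Apply the empirical 1D monotone transport map
    T(x) = F_target^{-1}(F_source(x)), estimated from two samples."""
    source_sorted = np.sort(source)
    # Empirical CDF of the source sample evaluated at x.
    u = np.searchsorted(source_sorted, x, side="right") / len(source_sorted)
    # Matching quantile of the target sample.
    return np.quantile(target, np.clip(u, 0.0, 1.0))

def expected_fake_yield(n_ee, fake_rate_lead, fake_rate_sublead):
    """Expected number of Z->ee events in which both electrons are
    misidentified, assuming the two misidentifications are independent."""
    return n_ee * fake_rate_lead * fake_rate_sublead

rng = np.random.default_rng(0)
# Toy invariant-mass samples in GeV (illustrative shapes only).
m_ee = rng.normal(91.2, 2.5, size=20_000)    # dielectron mass
m_fake = rng.normal(90.0, 3.0, size=20_000)  # mass when electrons fake photons

# Shape: move the ee mass distribution onto the fake-photon one.
m_transported = quantile_transport(m_ee, m_fake, m_ee)

# Normalization: hypothetical fake rates of 2% and 3%.
mean_yield = expected_fake_yield(1_000_000, 0.02, 0.03)
observed = rng.poisson(mean_yield)  # one pseudo-experiment around the mean
print(m_transported.mean(), m_transported.std(), mean_yield, observed)
```

In a realistic setting the fake rates would depend on kinematics such as transverse momentum and pseudorapidity, so the product would be evaluated per event rather than once globally.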

Local LLM + RAG
Early exploration of retrieval-augmented generation using local inference and vector search.

Paper Recommendation System
Vector-based recommendation with user feedback and an interactive graph UI.

Physics Ph.D. background
Experimental HEP researcher within the ATLAS collaboration at the HL-LHC.

ML & Data Science
Optimal transport, GNNs, probabilistic modeling, and retrieval-augmented systems.

Open to opportunities
Seeking ML engineering or postdoc roles where physics meets data-driven insight.