Senior Data Scientist, Data Science Team, AXA France
June 2021 - Present
-
Developed text-extraction AI Systems using computer vision to read documents (driver's licenses, national ID cards, etc.). The models processed over 6M documents in batches using PySpark jobs.
-
Developed a range of reusable Python packages for document processing (OCR, object detection, etc.). The packages use the strategy pattern so that different algorithms and models can be swapped in quickly for experimentation.
-
Automated the indexing of 70% of AXA France's incoming documents by developing a retrainable document classification AI System (exposed as an API).
-
Created a YAML-based templating system to streamline the creation of Azure ML pipelines.
-
Built reusable Azure ML Pipelines (Continuous Training) to automatically retrain AI Systems and publish them to the Model Registry.
-
Designed Azure DevOps CI/CD/CT pipeline templates for use across the team's projects.
-
Led the hiring process, onboarded new team members, and facilitated efficient knowledge transfer.
-
Coordinated the annotation process with annotators and SMEs (subject matter experts) to ensure consistent datasets across various projects.
-
Led AXA's 'Python COP' initiative, organizing monthly programming talks and events on topics such as testing, web scraping, packaging, and code quality.
-
Taught a series of training sessions and courses on Python, data manipulation (pandas and PySpark), and software engineering best practices to over 30 AXA collaborators.