About
Throughout my academic and professional career, I have worked with large volumes of data, creating reports and visualizations for stakeholders. Fascinated by the potential to unearth insights from the vast amounts of data generated every day, I started pursuing a master's in Data Science. In the next 5 years, I see myself as an expert data science consultant who understands how information can be mined and who converts ordinary data into extraordinary insights to help businesses and individuals make data-driven decisions.
![](assets/img/profile-img.jpg)
Data Scientist/Engineer & Data Analyst.
My Motto: Transforming ordinary data to extraordinary insights and converting uncertainties into opportunities.
- Phone: +1 (585) 410 8771
- City: Rochester, New York, USA
- Degree: Master's in Data Science
- Email: sayankrswar@hotmail.com
With experience in the health insurance, manufacturing and signal-processing industries, I am proficient in programming languages such as Python and R, and skilled in using data querying and visualization tools such as SQL, Tableau, Power BI and Excel to uncover insights and drive business growth. With a strong foundation in statistics and machine learning, I am well-equipped to identify trends, patterns, and opportunities within data, and to develop predictive models that optimize outcomes and improve efficiency. I would be more than happy to connect and learn how I can help you make the most sense of your data.
Skills
Skills I have gained through professional experience and academic research/projects
Techstack/Languages
- Agile Development
- Python
- R
- SQL
- DAX
- C/C#
Machine Learning ![sklearn icon](assets/img_icon/sklearn.png)
- Regressions & Classification Models
- Support Vector Machines
- Bagging/Boosting Models
- Tree Based Models
- Clustering Models
- Time Series Models (ARIMA/Prophet)
Data Management/ETL
- MS SQL Server
- SQLAlchemy/DuckDB
- PostgreSQL
- SQL Server Integration Services (SSIS)
- ETL Pipeline Development
- ETL Automation (Task Scheduler/$Universe)
Statistical analysis ![scipy icon](assets/img_icon/scipy.png)
- Statistical Distributions
- Hypothesis Tests
- Parametric/Non-Parametric Models
- Generalized Linear Models
- Bayesian Statistics
- Network Statistics
- Network Community Detection
Cloud Services
- AWS Solutions Architect (wip)
- Databricks
- Spark, Delta Lake
- MLlib, MLflow/ML Registry
- SnapLogic Cloud Services
Data Visualization
- Power BI
- QlikView
- Tableau
- Plotly
- Microsoft Excel
- Seaborn
- Matplotlib
I also have brief work experience in web development with ASP.NET, ADO.NET, ASP.NET MVC, and JavaScript.
Professional Work Experience
Data Science Engineering Intern
Jun 2023 - Jul 2023
Tapecon, Buffalo, NY
- Leveraged data analytics to drive decision-making by identifying relevant data sources, integrating machine and enterprise resource system data, and constructing a centralized data mart in SQL Server Management Studio.
- Conducted in-depth analysis of manufacturing data using Python, providing insights into production trends, defects, and downtime patterns. Developed visual representations of data findings via Power BI reports, enabling efficient tracking of production key performance indicators and identification of operational gaps.
- Predicted downtime by applying the CatBoost algorithm to modeled data, using SHAP analysis to identify influential features.
- Estimated material waste with a Poisson generalized linear model, comparing actual and ideal production figures, and created a real-time Power BI dashboard to track it.
- Conducted statistical analysis (Kruskal-Wallis test) on environmental data to confirm the effect of humidity on product quality.
- Implemented Markov chain modeling techniques on production data, creating transition matrices that shed light on machine runtime and downtime patterns.
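As a rough illustration of the transition-matrix idea in the last bullet above, here is a minimal sketch in plain Python. It is not the project code: the function name `transition_matrix`, the two states (`run`, `down`) and the sample log are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def transition_matrix(states):
    """Estimate first-order Markov transition probabilities from a state sequence.

    Returns a nested dict: matrix[from_state][to_state] = probability.
    """
    # Count how often each state is followed by each other state.
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1

    labels = sorted(set(states))
    matrix = {}
    for s in labels:
        total = sum(counts[s].values())
        # Normalize each row so its probabilities sum to 1
        # (a state with no observed outgoing transition gets a row of zeros).
        matrix[s] = {t: (counts[s][t] / total if total else 0.0) for t in labels}
    return matrix


# Hypothetical machine-status log, one entry per time interval.
log = ["run", "run", "down", "run", "run", "run", "down", "down", "run"]
P = transition_matrix(log)

for state, row in P.items():
    print(state, {t: round(p, 2) for t, p in row.items()})
```

Each row gives the estimated probability of the machine's next status given its current one, so a high `down` to `down` value would point to long stoppages rather than brief interruptions.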
BI Application Developer
Aug 2019 - Jul 2022
Tata Consultancy Services, Montevideo, Uruguay
- Analyzed large clinical datasets with 250 million+ records, gathered business requirements for a healthcare payor, identified key information and designed reporting solutions with Power BI, QlikView & SSIS.
- Developed Power BI Dashboards for visualization & tracking of COVID-19 cases, determined infection types as per ICD Codes and analyzed by demographics and clinical summaries.
- Designed underlying data model, wrote DAX queries, built reports in Power BI & QlikView to monitor status of critical ETL jobs, SLA expectations & performance deviations to take corrective measures.
- Led team of 5 to design and develop Automated CMS Compliance clinical audit data reports; performed root-cause analysis of data issues/bugs in reports/SQL queries & SSIS packages, deployed fixes.
- Developed SnapLogic ETL pipelines for data migration from on-premises infrastructure to Cloud.
- Developed solutions/visualizations as process improvements that saved the client 50K USD annually.
- Used Excel macros and Power BI to develop an application for managers to track and debug inconsistencies in employee timesheet data.
ETL/SQL Application Developer
Jan 2017 - Jul 2019
Tata Consultancy Services, Chennai, India
- Created ETL & reporting applications using Microsoft SSIS and automated jobs with the Dollar Universe tool as a SQL database & ETL developer for a healthcare account.
- Designed ETL solutions to integrate data from heterogeneous OLTP sources and files to OLAP database while facilitating efficient data storage & complex reporting.
- Developed a .NET MVC web application to search database components within ETL packages by extracting required XML tags (a process improvement that saved 40K USD annually).
- Maintained project status and defect-related documents for auditing and managed internal team releases.
- Organized multiple events within the account (50+ employees) on a weekly basis for employee engagement and greater team collaboration.
- During training, led a team of 6 members and coded modules such as Login, Customer Account Management and Transaction Handling using C#, MVC architecture, the ADO.NET model, HTML and CSS.
Academic Work Experience
Research Assistant
Aug 2023 - Present
URSeismo, University of Rochester
- Applied unsupervised machine learning and sequential data mining techniques (Sequencer) to analyze surface wave velocity profiles across the African continent and categorize regions based on geological composition, contributing to a better understanding of Africa’s crustal architecture.
- Constructed a covariance matrix for pre-arrival seismic noise signals to model noise characteristics for improved receiver function extraction. Used a published equation and SciPy's optimize.curve_fit function to model noise as a function of lag time, producing parameters closely aligned with noise behavior at a seismic station.
- Conducted a comprehensive metadata analysis of a global seismic array database encompassing ~2 TB of data. Presented findings on data availability trends across different years, networks, and stations, assisting the advisor in a comprehensive global seismic study.
- Developed Python scripts to preprocess millions of seismic records and generate SAC files. Implemented Slurm-based Linux cluster scripts to parallelize the code, reducing preprocessing time from ~4 days to a matter of hours.
- Built a Python Anaconda environment on a Linux-based system for the lab, with the necessary seismic data analysis libraries. Participated in lab meetings and reading groups, and presented papers and coding workflows.
- Working on earthquake detection for Africa’s TORD and TAM seismic stations, exploring template matching and deep learning techniques.
Data Analyst & Graduate Assistant
Aug 2022 - May 2023
Graduate Education & Post Doctoral Affairs Office, University of Rochester, NY
- Managed and analyzed the applicant database (Slate) for insights into enrollment/admissions.
- Generated comprehensive analytical reports using Tableau for the Graduate Admissions office.
- Engaged in student outreach and responded to queries from prospective students.
- Coordinated and oversaw graduate student activities and events.
Teaching Assistant
Aug 2022 - May 2023
Data Science Department, University of Rochester, NY
- Lead Teaching Assistant for the CSC 261/461 Database Systems Course.
- Managed and coordinated a team of 6 TAs.
- Oversaw the grading of projects and midterms.
- Taught classes, conducted help sessions and addressed students' conceptual questions.
- Liaised with the Professor to ensure streamlined execution of tasks and responsibilities.
Projects
Here are details of a few projects I have worked on to date. Hover over a project for a summary and click if you would like to know more 🤓
NYC Citibike Demand Forecasting
Report/Code![](assets/img/portfolio/nycbike.png)
Building an end-to-end machine learning pipeline that can predict the hourly demand for NYC Citi Bikes at a particular station.
URMC Specialty Referral Network Analysis
Report/Code![](assets/img/portfolio/network.png)
Can we visualize the bottlenecks that occur in the specialty referral network so that opportunities for improvement can be further explored?
Economical Data Engineering
Report/Code![](assets/img/portfolio/depipe.png)
Can we build efficient data engineering pipelines with in-house technology and no additional software overhead cost?
Unsupervised Learning with Seismic Signals
Report/Code![](assets/img/portfolio/seismic.png)
How can we use unsupervised learning and deep learning methods such as K-means and Deep Embedded Clustering to find meaning in seismic signals?
Forecasting Scrap Tire Receipts for ATD
Report/Code![](assets/img/portfolio/atd.png)
Can we inform American Tire Distributors (ATD) about how many scrap tires will be collected by a given Distribution Center every day?
Developing Machine Learning Models From Scratch
Report/Code![](assets/img/portfolio/mlscratch.png)
Is it all right to use machine learning concepts and models such as Regression, Gradient Boosting and PCA without understanding the fundamentals?
Education
University of Rochester
Master's in Data Science, Aug 2022 - Dec 2023
Grade: 3.7
- Computational Statistics
- Time Series Analytics
- Statistical Machine Learning
- Network Science Analytics
- Spark & Databricks
- Data Mining
Testimonials
A few recommendations received during my professional and academic tenure:
Fun Facts
Loves Watching Movies
Sings and Plays Guitar
Volunteers and Supports
Loves Hiking/Travelling
Contact
Feel free to contact me using the information below.
Location:
Rochester, New York
Email:
sayankrswar@hotmail.com
Call:
+1 (585) 410 8771