Cailiang (Andrew) Xu

Andrew Xu's Portfolio

View My GitHub Profile

AI & Data Developer

Technical Skills:

A/B Testing, AWS, Azure Machine Learning Studio, C++, Docker, Excel, Git, Kubernetes, Python, Statistical Modelling in R and Python, SQL, Streamlit (Python), Tableau, RStudio, SAS, Stan, Stata


Education

M.S., Business Analytics
Loyola Marymount University
Aug 2024
B.S., Statistical Science
University of California, Santa Barbara
Dec 2020

Work Experience

Machine Learning Intern @ Sun West Mortgage Company, Inc. (May 2024 - Present)
  • Develop a machine learning model from scratch (Transformer architecture with a copula network) to determine the optimal timing for selling loans to investors, increasing decision-making efficiency and accuracy by 5%.
Data Scientist Intern @ Children's Hospital Los Angeles (CHLA) (May 2024 - Aug 2024)
  • Increased hospital resource planning efficiency by developing and deploying a predictive model using Python and Azure Machine Learning Studio, raising recall on appointment no-shows by 20% and enabling better allocation of resources.
  • Enhanced decision-making for non-technical clients and managers by creating easy-to-understand Tableau dashboards and Python graphs that illustrate performance metrics, making project progress clear and enabling informed choices about next steps.
Research Assistant for Dr. Jade Chen @ Loyola Marymount University (Oct 2024 - Present)
  • Manage and process 800GB+ of data across multiple Parquet files, enabling efficient extraction of critical information and accelerating ongoing research efforts.
  • Converted a 2GB+ LinkedIn dataset from JSON to CSV using Python, and improved the code's readability, maintainability, and scalability for large-scale data processing.
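For illustration only, a JSON-to-CSV conversion of this kind can be streamed record by record so that memory use stays flat regardless of file size. The sketch below assumes newline-delimited JSON and uses hypothetical field names (`name`, `title`, `company`) and a hypothetical helper `jsonl_to_csv`; the actual dataset layout and project code are not shown on this page.

```python
import csv
import io
import json


def jsonl_to_csv(src, dst, fields):
    """Stream newline-delimited JSON from src into CSV rows on dst.

    Only one record is held in memory at a time. Returns the row count.
    """
    writer = csv.DictWriter(dst, fieldnames=fields)
    writer.writeheader()
    count = 0
    for line in src:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        record = json.loads(line)
        # Missing keys become empty cells; keys outside `fields` are dropped.
        writer.writerow({f: record.get(f, "") for f in fields})
        count += 1
    return count


# Hypothetical sample records standing in for the real data.
sample = io.StringIO(
    '{"name": "A", "title": "Analyst", "company": "X", "extra": 1}\n'
    '\n'
    '{"name": "B", "title": "Engineer"}\n'
)
out = io.StringIO()
rows_written = jsonl_to_csv(sample, out, ["name", "title", "company"])
```

In practice `src` and `dst` would be file objects opened with `open(...)`; for CSV output the `csv` module documentation recommends opening the file with `newline=""`.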
Software Engineer Intern @ Neuroleap (Sep 2021 - May 2022)
  • Engineered hand-gesture software with embedded machine-learning handwriting recognition (computer vision) to assist children with motor skill impairments.
  • Deployed the system to a sample of 100 students, achieving a 15% improvement in writing accuracy and a 20% increase in task completion speed over 6 months.
Data Analyst Intern @ China Resources Land (Dec 2019 - Feb 2020)
  • Constructed an iron supply forecasting model using SQL and Python, analyzing consumption patterns and purchase volumes to increase sales by 1.5%.
  • Created interactive data visualizations with R and Tableau to illustrate supply-demand dynamics, improving stakeholders' understanding and decision-making and contributing to a 10% increase in operational efficiency.

Projects

Advanced NLP System for Pharmaceutical Sentiment Analysis
  • Architected and implemented an NLP pipeline for analyzing sentiment in medication reviews, supporting drug safety monitoring and patient feedback analysis.
  • Boosted sentiment classification accuracy by 4.5% by integrating advanced machine learning techniques and custom BERT models.
  • Combined traditional NLP methods with deep learning approaches, processing thousands of user-generated medical comments to extract insights for pharmaceutical companies and healthcare providers.
Video Game Sales and Tableau

Interactive Dashboard Preview

  • ETL in Python: Efficient extraction and transformation of large datasets.
  • Star Schema: Designed for faster query performance, improving response time by 10%.
  • Tableau Dashboards: Created for stakeholders to convey insights from the analysis clearly and interactively.
Spotify Stream Scan
  • Designed an end-to-end project that retrieves 1GB+ of Spotify API data daily and predicts genre trends with machine learning algorithms, identifying which of more than 25 genres best positions upcoming artists to capture those trends.
  • Conducted an A/B test with 100 participants, using chi-square tests and t-tests to assess the significance of differences between groups.
  • Implemented modular programming for scalable genre expansion.
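As a rough illustration of the chi-square step in an A/B test like the one above, the sketch below runs a Pearson chi-square test on a 2x2 table using only the standard library. The counts are made up for the example (50 participants per group is an assumption), the helper `chi_square_2x2` is hypothetical, and this is not the project's code or data.

```python
import math


def chi_square_2x2(a_success, a_total, b_success, b_total):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) with one degree of freedom.
    """
    observed = [
        [a_success, a_total - a_success],
        [b_success, b_total - b_success],
    ]
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    n = sum(row_sums)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of group and outcome.
            expected = row_sums[i] * col_sums[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 30 of 50 successes in group A, 20 of 50 in group B.
stat, p_value = chi_square_2x2(30, 50, 20, 50)
```

With these illustrative counts the statistic is 4.0 and the p-value falls just under 0.05. With real data a library routine such as SciPy's `chi2_contingency` would be the usual choice; note that it applies a continuity correction to 2x2 tables by default, so its result would differ slightly from this sketch.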
Redfin Real Estate Project
  • Improved data quality by extracting and refining Redfin real estate data with Excel, Power Query, and SQL, ensuring data integrity through rigorous checks.
  • Informed investment decisions using advanced statistics and Tableau, potentially increasing yield by up to 10% per transaction.
Research on the Influencing Factors of GDP via Statistical Modeling
  • Conducted a comprehensive analysis using RStudio and Stata to identify 10+ significant factors influencing GDP, examining correlations with real interest rates and exports, and confirming statistical significance at p < 0.05.

Awards

2024 LMU Datathon, 1st Place
  • Developed a machine learning classifier within 24 hours to predict eligibility for government grants targeting women-owned and minority-owned businesses.
  • Presented the final solution and secured 1st place with advanced predictive analytics and data modeling techniques.

Contact Me

Click here to get in touch