Thomas J. James

Making the Best Better

Academic Work Experience — University of Michigan

Projects developed while completing the University of Michigan's Master of Applied Data Science degree

Role: Data Scientist
Project: Recommender System Engineering — Developing Online-Course Recommendations
Institution: University of Michigan School of Information

  • Develop content-based recommender system to suggest online courses according to occupation title
  • Perform data gathering and preprocessing using Python, ETL, SQL (PostgreSQL), NLP, TensorFlow, and AWS
  • Compare word vectorizers and word embeddings: TF-IDF, GloVe, BERT, and RoBERTa
  • Reach cosine similarity levels of up to 93% for recommended courses
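
As an illustration of the content-based approach described above, the following is a minimal sketch that ranks courses against an occupation title using TF-IDF vectors and cosine similarity. It uses only the Python standard library; the course catalog, the function names, and the whitespace tokenizer are assumptions made for this example and do not reproduce the project's actual pipeline or embeddings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real NLP preprocessing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF keeps weights positive even for terms found in every document.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(occupation, courses, top_n=2):
    """Rank course titles by cosine similarity to an occupation title."""
    titles = list(courses)
    vectors = tfidf_vectors([occupation] + [courses[t] for t in titles])
    query, course_vectors = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), title) for title, vec in zip(titles, course_vectors)]
    return sorted(scored, reverse=True)[:top_n]


# Hypothetical catalog: course title -> short description.
COURSES = {
    "Intro to SQL": "sql databases queries for data analyst",
    "Deep Learning": "neural networks tensorflow deep learning",
    "Watercolor Basics": "painting watercolor art techniques",
}

if __name__ == "__main__":
    for score, title in recommend("data analyst", COURSES):
        print(f"{title}: {score:.2f}")
```

Swapping the TF-IDF step for pretrained embeddings such as GloVe or BERT would change only how the vectors are built; the cosine ranking step would stay the same.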

"Thousands of people are interested in professional development courses online, but do not know where to start. Online-course recommendations can provide directional guidance to professionals eager for continuous growth."

Role: Data Scientist
Project: NBA Data Predictions & Analytics Using Machine Learning & Natural Language Processing
Institution: University of Michigan School of Information

  • Collaborate with a team to visualize the potential positive and negative influences of Twitter conversations about the Golden State Warriors and Boston Celtics
  • Detect influences of Twitter conversations using regex, sentiment analysis, semantic analysis, sarcasm detection, Word2Vec models, latent semantic indexing, latent Dirichlet allocation, and topic-coherence models on 8,195 columns of text data
  • Gather and present scalable insights and analytics through data visualizations, observing an increase in conversations and a dominance of criticism during the 2022 playoffs

Role: Data Scientist
Project: Predicting Rankings of Marathon Runners
Institution: University of Michigan School of Information

  • Predict rankings of marathon runners using raw datasets, Python, exploratory data analysis, feature engineering, data preprocessing, data analytics, and machine learning algorithms
  • Establish optimal hyperparameter levels and feature variables for machine-learning rank forecasting, producing predictive model results at 74% accuracy
  • Forecast marathon rankings while reducing the error rate from 35% to 26%

Role: Voluntary Data Scientist
Project: Synthesizing Office Floor-Plans Using Generative Adversarial Networks
Institution: University of Michigan School of Information

  • Assisted an architectural student's project through synthesizing new office floor plan images
  • Restructured the project plan after 45 days to accommodate team adjustments
  • Developed a pix2pix GAN model using an undisclosed library of office floor-plan images, AWS, and experimentation with different analytical and statistical approaches
  • Halted project during fine-tuning stages to join a new group and new project

Role: Data Scientist
Project: Predictive Model to Identify Students Who Are At-Risk of Failing a Course
Institution: University of Michigan School of Information

  • Extract and recode the "final result" grade data
  • Perform data wrangling and feature engineering to ensure all subset and engineered files are combined into one flat file
  • Apply a weight to the scores to achieve a more accurate data representation
  • Constrain the prediction model to use only data recorded up to day 60 of the course
  • Apply Logistic Regression and Random Forest Classifier models to visualize predicted probabilities and identify a probability cut-off for identification of highest-risk students needing extra attention
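
The cut-off step above can be sketched as follows. This is a minimal illustration that assumes predicted failure probabilities are already available from a fitted classifier and that the cut-off is chosen by maximizing F1 over a grid of candidates; the source does not state which selection criterion was used, and the data and function names here are hypothetical.

```python
def precision_recall(probs, labels, cutoff):
    """Precision and recall when students with prob >= cutoff are flagged as at-risk."""
    flagged = [p >= cutoff for p in probs]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_cutoff(probs, labels, candidates=None):
    """Return (cutoff, f1) for the candidate cut-off with the highest F1 score.

    Ties are resolved in favor of the lowest candidate, which flags more students.
    """
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    best_f1, best_c = -1.0, None
    for c in candidates:
        precision, recall = precision_recall(probs, labels, c)
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        if f1 > best_f1:
            best_f1, best_c = f1, c
    return best_c, best_f1


# Hypothetical model output: probability of failing, and the true outcome (1 = failed).
PROBS = [0.92, 0.81, 0.64, 0.52, 0.30, 0.12]
LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    cutoff, f1 = best_cutoff(PROBS, LABELS)
    print(f"cut-off={cutoff:.2f}, F1={f1:.2f}")
```

In practice the criterion could instead favor recall, since missing an at-risk student is usually costlier than flagging one who would have passed.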

Role: Data Scientist
Project: Environmental Effects On Housing Prices (Milestone I)
Institution: University of Michigan School of Information

  • Develop data analysis comparing Zillow house prices with LANDSAT satellite imagery and air pollution data
  • Prepare data for analysis using ETL, data cleaning, data preprocessing, EDA, and data visualization
  • Perform exploratory data analysis to determine which feature variables and feature engineering are needed
  • Observe and analyze results statistically through heat maps, histograms, and scatterplots
  • Correlate satellite imagery of vegetation levels with air pollution and house prices
  • Discover 17.5% variance in housing prices
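
The correlation step above can be illustrated with a short sketch that computes the Pearson coefficient between a vegetation index and house prices. The numbers below are invented for the example and the function name is hypothetical. The squared coefficient is one common way to express the share of variance explained, but the source does not say how the 17.5% figure was derived, so this is not a reconstruction of it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical per-neighborhood values: vegetation index and median price (in $1,000s).
VEGETATION = [0.10, 0.20, 0.30, 0.40, 0.50]
PRICE = [200, 220, 260, 270, 310]

if __name__ == "__main__":
    r = pearson(VEGETATION, PRICE)
    print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```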