About

  • I am a software engineer at Instacart working on machine learning infrastructure. I am primarily interested in the machine learning space, focusing on infrastructure to support advertising, information retrieval, and other machine learning systems. Prior to joining Instacart, I was a Software Engineer working on ads delivery at Facebook, and before that a Postdoctoral Fellow at the University of Waterloo in Canada.

Employment

  • Machine Learning Infrastructure Engineer | Instacart

    Building out the infrastructure that powers machine learning at Instacart. Instacart is built within the AWS ecosystem, so I gained a lot of experience with AWS tools and with management and monitoring tools such as Terraform and Datadog. During my time at Instacart, I identified many areas for cost saving, including multi-million-dollar savings from removing redundant logging.
  • Software Engineer | Meta (fka Facebook)

    My time at Facebook was spent in the Ads & Business Platform organization. I focused on solving advertiser-facing issues within the delivery system, including diagnosing systemic inefficiencies in ranking and delivery systems, large-scale back-end migrations to unblock scaling of products, and development of new products. I also served as an internal hiring point of contact for the team, managed and mentored interns, and was a ramp-up buddy for new engineers, both senior and junior, joining the team and org.
  • Postdoctoral Fellow | University of Waterloo

    Developed a novel anytime, score-safe document scoring algorithm. Conducted research on the reproducibility and replicability of information retrieval and machine learning NLP systems. This included software and library versioning, hyperparameter tuning, and hardware-level details.

    The role also included instructing the following course:
    • CS241: Foundations of Sequential Programming (aka Introduction to Compilers)
  • Assistant Research Fellow | University of Otago

    Investigated the areas of the indexing process that the rest of the system was waiting on, and analysed them for potential speed-ups.
  • Lab Demonstrator | University of Otago

    A lab demonstrator has similar responsibilities to a teaching assistant, helping students with their practical lab work. I demonstrated while studying for both my MSc (COMP150) and my PhD (COSC241, COSC242, COSC244).
    • COMP150: Practical Programming (in Python)
    • COSC241: Programming and Problem Solving
    • COSC242: Algorithms and Data Structures
    • COSC244: Data Communications, Networks and the Internet
  • Research Assistant | University of Otago

    A research assistant is typically employed on a short-term contract in order to assist staff members with an ongoing research project. I have been involved in two of these:
    Relevance and Readability
    Working on incorporating readability metrics into a search engine to re-rank results so that they are readable as well as relevant. This involved working on ATIRE, a large C++ codebase with multiple contributors, and was undertaken simultaneously with my MSc study.
    Collaborative Filtering Improvement
    Working on improving the predictions of the collaborative filtering system I developed for my honours degree project by implementing high-level algorithms.

Education

  • PhD | Computer Science, University of Otago

    Thesis Title: Improved Indexing and Search Throughput

    Investigating various ways to make indexing and searching web-scale collections more efficient without impacting the effectiveness of the system. For instance, we can choose not to index certain documents, but if we choose the wrong documents then this could have a significant impact on effectiveness. My research was performed using ATIRE, an open-source search engine developed at Otago; I remain involved in its development.
  • MSc (Thesis Only) with Distinction | Computer Science, University of Otago

    Thesis Title: The New User Problem in Collaborative Filtering

    The new user problem is one that all collaborative filtering systems must face: how can the system make recommendations for a user when it does not know what that user likes?
  • BSc(Hons) First Class | Computer Science, University of Otago

Publications

Service

  • Information Processing and Management (Elsevier)
    Reviewer
  • Information Processing Letters (Elsevier)
    Reviewer
  • Information Retrieval Journal (Springer)
    Reviewer
  • Transactions on Information Systems (ACM)
    Reviewer
  • Transactions on Knowledge and Data Engineering (IEEE)
    Reviewer
  • SIGIR2021 — 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
    Senior Program Committee [Long Papers]
  • SIGIR2020 — 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
    Senior Program Committee [Long Papers]
    Program Committee [Short Papers]
  • ADCS2019 — 24th Australasian Document Computing Symposium
    Program Committee
  • CIKM2019 — 28th ACM Conference on Information and Knowledge Management
    Program Committee
  • SIGIR2019 — 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
    Program Committee
  • ADCS2018 — 23rd Australasian Document Computing Symposium
    Program Committee
  • SIGIR2018 — 41st International ACM SIGIR Conference on Research and Development in Information Retrieval
    Program Committee (CoreIR/Search and Ranking Track)
  • ADCS2017 — 22nd Australasian Document Computing Symposium
    Program Committee
  • ICTIR2017 — 3rd ACM International Conference on the Theory of Information Retrieval
    Program Committee
  • ADCS2016 — 21st Australasian Document Computing Symposium
    Program Committee
  • ADCS2015 — 20th Australasian Document Computing Symposium
    Program Committee
  • ADCS2014 — 19th Australasian Document Computing Symposium
    Additional/Sub-Reviewer
  • LIARR Workshop @ SIGIR2017 — Lucene for Information Access and Retrieval Research
    Program Committee
  • RIGOR Workshop @ SIGIR2015 — Reproducibility, Inexplicability, and Generalizability of Results
    Background Machinations – Open-Source IR Reproducibility Challenge

Presentations

  • Questionable Answers in Question Answering
    Presentation for UWaterloo Data Systems Group Meeting (extended version of NAACL2018)
    Presentation for NAACL2018
  • Query Processing Strategies in Information Retrieval
    Presentation for UWaterloo Data Systems Group Meeting
  • Rank-at-a-Time Query Processing
    Presentation for ICTIR2016
  • Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge
    Presentation for ECIR2016
  • Collision Resolution in Hash Tables for Vocabulary Accumulation During Parallel Indexing
    Presentation for ADCS2015
  • Improving Throughput of a Pipeline Model Indexer
    Presentation for ADCS2015
  • The Reproducibility Challenge Overview
    Presentation for the RIGOR Workshop @ SIGIR2015
  • Designing a Hash Function for Information Retrieval
    Presentation for the Computer Science and Information Science Seminars
  • Pipes, Trees and Hash
    Presentation for the Otago Computer Science Systems Group Seminar
  • Diversified Relevance Feedback
    Presentation for the SIGIR2013 Doctoral Consortium
    Re-presented for the Otago Computer Science Postgraduate Symposium
  • Effects of Spam Removal on Search Engine Efficiency and Effectiveness
    Presentation for ADCS2012
  • Diversification in Information Retrieval
    Presentation for Otago Computer Science Postgraduate Symposium
  • The Cold-Start Problem in Collaborative Filtering
    Presentation for Otago Computer Science Postgraduate Symposium

Awards

  • University of Otago Computer Science Department Postgraduate Publishing Prize
  • ACM SIGIR Student Travel Grant
  • ADCS Student Travel Grant

Projects

  • I have been heavily involved in the development of the ATIRE search engine. ATIRE is a research search engine written in C++ and has been demonstrated to be fast at both indexing and searching.
  • I wrote the memcached exporter for the Prometheus monitoring system. This has since been incorporated as a project under the Prometheus GitHub organization.
  • I did the majority of the background running and checking of scripts for the Reproducibility Challenge.
  • Mainly as a learning exercise, I designed and built a community site for the game Hearthstone. The site allowed players to install a client that uploaded game logs to a server. These logs were then parsed to provide an accurate, web-viewable play-by-play of each game. During this exercise I learnt a lot about the Docker and AWS ecosystems, as well as modern front-end development tools such as React.js.