PhD — Computer Science, University of Otago, Mid 2012–End 2015 (Conferred May 2016)
Thesis Title: Improved Indexing & Search Throughput.
Investigating ways to make indexing and searching web-scale collections more efficient without harming the effectiveness of the system. For instance, we can choose not to index certain documents, but if we choose the wrong documents then effectiveness could suffer significantly. My research was performed using the ATIRE open source search engine, which was developed at Otago; I remain actively involved in its development.
During my candidature I was, and continue to be, an active member of the information retrieval community, having attended and presented at multiple conferences and workshops, including SIGIR, CIKM, and ADCS.
My original topic concerned relevance feedback and diversification, both of which improve results while appearing to perform directly opposed operations.
MSc (Thesis Only) with Distinction — Computer Science, University of Otago, Late 2009–Early 2011
Thesis Title: The New User Problem in Collaborative Filtering.
The new user problem is one that all collaborative filtering systems must face: how can the system make recommendations for a user when it does not know what that user likes?
I developed new methods for a collaborative filtering system to choose which items to present to a new user for them to rate. Presenting items in this way alleviates the new user problem for collaborative filtering systems.
Developed new metrics for evaluation of, and comparison between, the different methods a system might choose to select the next item.
BSc(Hons) First Class — Computer Science, University of Otago, 2005–2008
For my honours research project I worked on the Netflix Prize, an exercise in collaborative filtering, machine learning and data mining. Honours projects are year-long projects designed to test research ability.
An Exploration of Serverless Architectures for Information Retrieval
Matt Crane, Jimmy Lin
ICTIR 2017, to appear, 2017
Quantization in Append-Only Collections
Salman Mohammed, Matt Crane, Jimmy Lin
ICTIR 2017, to appear, 2017
An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures
Zhucheng Tu, Matt Crane, Royal Sequiera, Junchen Zhang, Jimmy Lin
NeuIR Workshop @ SIGIR 2017, published as arXiv:1707.08275, 2017
The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017
Leif Azzopardi, Matt Crane, Hui Fang, Grant Ingersoll, Jimmy Lin, Yashar Moshfeghi, Harrisen Scells, Peilin Yang, Guido Zuccon
SIGIR 2017, pages 1429–1430, 2017
Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems
Joel Mackenzie, J. Shane Culpepper, Roi Blanco, Matt Crane, Charles L. A. Clarke, Jimmy Lin
A Comparison of Document-at-a-Time and Score-at-a-Time Processing
Matt Crane, J. Shane Culpepper, Jimmy Lin, Joel Mackenzie, Andrew Trotman
WSDM 2017, pages 201–210, 2017
Rank-at-a-Time Query Processing
Ahmed Elbagoury, Matt Crane, Jimmy Lin
ICTIR 2016, pages 229–232, 2016
Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge
Jimmy Lin, Matt Crane, Andrew Trotman, Jamie Callan, Ishan Chattopadhyaya, John Foley, Grant Ingersoll, Craig Macdonald, Sebastiano Vigna
ECIR 2016, pages 408–420, 2016
Improved Indexing & Search Throughput
PhD Thesis, 2016
Report on the SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR)
Jaime Arguello, Matt Crane, Fernando Diaz, Jimmy Lin, Andrew Trotman
ACM SIGIR Forum 49 (2), pages 107–116, 2015
Collision Resolution in Hash Tables for Vocabulary Accumulation During Parallel Indexing
Matt Crane, Andrew Trotman
ADCS 2015, pages 1–4, 2015
Improving Throughput of a Pipeline Model Indexer
Matt Crane, Andrew Trotman, David Eyers
ADCS 2015, pages 5–8, 2015
Malformed UTF-8 and Spam
Matt Crane, Andrew Trotman, Richard O'Keefe
ADCS 2013, pages 101–104, 2013
Managing Short Postings Lists
Andrew Trotman, Xiang-Fei Jia, Matt Crane
ADCS 2013, pages 113–116, 2013
Maintaining Discriminatory Power in Quantized Indexes
Matt Crane, Andrew Trotman, Richard O'Keefe
CIKM 2013, pages 1221–1224, 2013
Diversified Relevance Feedback
SIGIR 2013, page 1142, 2013
Posters & Presentations
Designing a Hash Function for Information Retrieval
Presentation for the Computer Science and Information Science Seminars (July 2014)
Pipes, Trees & Hash
Presentation for the Otago Computer Science Systems Group Seminar (May 2014)
Effects of Spam Removal on Search Engine Efficiency and Effectiveness
Presentation for ADCS2012
Diversification in Information Retrieval
Presentation for Otago Computer Science Postgraduate Symposium 2012
The Cold-Start Problem in Collaborative Filtering
Presentation for Otago Computer Science Postgraduate Symposium 2010
Information Processing & Management (Elsevier)
Information Retrieval Journal (Springer)
Transactions on Information Systems (ACM)
ICTIR2017 — 3rd ACM International Conference on the Theory of Information Retrieval
ADCS2016 — 21st Australasian Document Computing Symposium
ADCS2015 — 20th Australasian Document Computing Symposium
ADCS2014 — 19th Australasian Document Computing Symposium
Teaching & Job History
Instructor — University of Waterloo
- CS241: Foundations of Sequential Programming [Fall 2016]
Lab Demonstrator — University of Otago
A lab demonstrator has similar responsibilities to a teaching assistant, helping students with their practical lab work. I demonstrated while studying for both my MSc (COMP150) and my PhD (COSC241, COSC242, COSC244).
- COMP150: Practical Programming (in Python) 
- COSC241: Programming and Problem Solving [2013, 2014, 2015]
- COSC242: Algorithms and Data Structures [2014, 2015]
- COSC244: Data Communications, Networks and the Internet [2014, 2015]
Assistant Research Fellow — University of Otago
Tasked with identifying and analysing the areas within the indexing process that the rest of the system was waiting on.
Research Assistant — University of Otago
A research assistant is typically employed on a short-term contract in order to assist staff members with an ongoing research project. I have been involved in two of these:
Relevance and Readability 
Working on incorporating readability metrics into a search engine to re-rank results in order to return readable as well as relevant results. Involved working on a large C++ codebase (ATIRE) worked on by multiple people. Undertaken simultaneously with MSc study.
Collaborative Filtering Improvement [2008–2009]
Working on improving the predictions made by the collaborative filtering system I developed as part of my honours degree project, by implementing high-level algorithms.
I have been heavily involved in the development of the ATIRE search engine. ATIRE is a research search engine written in C++ and has been demonstrated to be fast at both indexing and searching.
I wrote the memcached exporter for the Prometheus monitoring system. This has since been incorporated as a project under the Prometheus GitHub organization.
I did the majority of the background running and checking of scripts for the Reproducibility Challenge.
Mainly as a learning exercise, I designed and built a community site for the game Hearthstone. The site allowed players to install a client which would upload game logs to a server. These logs would then be parsed to provide an accurate, web-viewable play-by-play version of the game. During this exercise I learnt a lot about the Docker and Amazon AWS ecosystems, as well as modern front-end development libraries such as React.js.