This is a preview of a non-profit social benefit organization designed to accelerate academic publishing, starting with Computer Science and Natural Language Processing research.
This organization tackles problems with academic software that authors and research groups struggle to address on their own, even with sufficient training. For example:
Months can be spent rewriting code that already exists in order to reproduce a result in a paper, often because the code used to develop the paper was never shared, but sometimes because the shared code is too hard to understand, build, or execute. A less discussed need is reproducing intermediate states of a model or algorithm during paper development, for example to retrain a model from an earlier point in time, to test a single step of an algorithm in isolation, or to better keep track of intermediate results.
Solutions: We’re excited to be working with the ACM’s Artifact Review and Badging committee to develop open source tools that help authors quickly containerize diverse software artifacts so they can submit more artifacts and earn badges.
We’re also reaching out to machine learning authors to learn how we can store the history of their model changes and training steps to help them accelerate their current projects. If you’ve published papers or reports in machine learning, you can help.
Creating a modular and experiment-friendly open source project can take weeks to months of documentation and development work, time that most graduate students and faculty don’t have. Further, projects developed by industry are often “one size fits all”, meaning that they aren’t designed to give students and faculty the visibility or the flexibility they need. In the worst case, valuable academic software projects do exist but research groups end up abandoning them when original authors leave and no one in the group has the time to learn how to maintain them.
Solutions: We’re currently working with a research group in Chicago on open sourcing a mature Natural Language Processing project in a sustainable way. That means transforming an experimental code base into modular, reusable steps that can be written in any modern programming language. We’ll publish an in-depth case study covering the benefits of the new design soon.
Although attracting new users can lead to more citations, authors rarely have the time or the resources to help new users with their projects after publication. A small user base also means limited opportunities to find outside contributors and sponsors who can keep valuable projects going after the original authors move on.
Solutions: The authors of a popular research database need a sustainable home for their project, which hasn’t been updated in several years. We’re offering to host the database, help to keep it updated, create a modern promotional website, and maintain documentation that attracts new users in both academia and industry. More details to come in Spring 2019 as this project progresses.
Valuable projects with broad reach, for example projects used by many similar research groups or projects used by both academia and industry, can become highly sustainable by finding sponsors willing to contribute intellectual, financial, and technical resources in exchange for being able to use the projects to gain new technical talent, users, or investment opportunities. Unfortunately, the average author doesn’t have the networks or the time to pitch to these potential sponsors. And without mediation, some sponsors may try to transform projects into something that primarily serves their needs, ignoring and effectively cutting off the original research communities.
Solutions: With a base in Silicon Valley, we’re able to work with government representatives, large technology companies, and venture capitalists that have strong incentives to invest in new Computer Science and Natural Language Processing research. As a social benefit organization and a neutral intermediary, we hope to encourage sponsors to invest in a broad range of projects.
Send us a quick note and we will do our best to reply within 48-72 hours. Address all messages to
Erin Dahlgren <firstname.lastname@example.org>. For example:
To help with machine learning reproducibility:
Subject: “Accpub machine learning”
To receive updates on current and future projects:
Subject: “Accpub update me”
To learn how you can get involved:
Subject: “Accpub get involved”
Hi, I’m Erin Dahlgren, the initiator of these projects. I have a background in Linguistics research (University of Chicago), startups (Y Combinator), and security research for virtualization at scale (Google). I’ll be managing communications, finding sponsors, and leading development of our initial projects.
Nothing for academic researchers. We may decide to charge outsiders (e.g. industry) for tools we develop as a way to raise revenue to support more projects in need of our help.
The working name of this organization is “The Rearmory”, which stands for “The Research Armory”.
We’re still working on our website. We’d love to know about your favorite well-designed informational websites: send links to
Erin Dahlgren <email@example.com>.
The Software Sustainability Institute is a fantastic organization that provides software training to academic researchers (sometimes in partnership with the Carpentries), awards stipends through a fellowship program to encourage workshops and training, and offers many other benefits not limited to educational materials and software consulting.
The organization described here takes a different, complementary approach. Similar to the Linux Foundation and the Apache Foundation, it takes specific software and database projects under its wing and incubates them under an umbrella. It is not capable of providing the level of training that the Software Sustainability Institute or the Carpentries can provide, but it can lead in the development, investment, and promotion of valuable projects that don’t have enough time or resources to get to the next level.
We hope to work with the Software Sustainability Institute as an employer of research software engineers. We share their belief that research software engineers are fundamental in a world where most research is powered by software.
At the moment, we operate as a software development and consulting practice, free for individual academics at participating institutions in exchange for projects coming under our umbrella, similar to how the Linux Foundation and the Apache Foundation operate.
In general, the projects we work on have open source code. This gives academic researchers the visibility they need to explain and conduct experiments. With the permission of authors, we may explore adding closed-source features to projects, which we would still offer to academics for free but sell to outsiders as a way to raise revenue to support more projects in need of our help.
Not at the moment. If you are interested in working as a research software engineer on open source infrastructure for Computer Science and Natural Language Processing research, send a message to
Erin Dahlgren <firstname.lastname@example.org> with a description of your research interests.
Definitely! We’d love to hear about what you’re doing to see if and how we can help. It’s possible that we can better justify investing in an area if we hear about similar projects or problems from multiple people. Send inquiries to
Erin Dahlgren <email@example.com> with a description of your project and any important deadlines.