Written by: Joe Doliner, CEO and Co-Founder of Pachyderm
2018 was an exciting year for Pachyderm. The growth we saw for the project, the user base, and the company surpassed our expectations. To give you an idea, Pachyderm’s codebase had more than 1,800 commits added, 1.5 million lines changed, and 24 new external contributors. We saw a spike in community-generated content as well. Key examples include an excellent, in-depth blog written by Dr. Samantha Zeitlin, the lead data scientist at Denali publishing, and a great talk on modern data science delivered at Go Northwest by Sam Kreter, a software development engineer at Microsoft.
Furthermore, more than one thousand new users joined our support channels to ask questions and offer advice to others. It’s very rewarding for us to see the momentum build around Pachyderm, and based on what we observed in 2018, we seem to be well on our way to building a sustainable open source community.
Community momentum also fueled Pachyderm’s commercial growth in 2018. While helping numerous customers reshape data science within their organizations, Pachyderm released two major versions of the product, created a few case studies, and welcomed five new team members. We even managed to raise some money. Needless to say, we’re thrilled. But that was sooo last year…
Here’s a quick look at what you can expect from the Pachyderm project this year:
Pachyderm pipelines will be extended to support even more complex workflows and heavier workloads. This will enable us to enhance Pachyderm pipelines while also tightening our integration with Kubernetes and the surrounding ecosystem:
You can find out more information here: https://github.com/pachyderm/pachyderm/issues/3345
This year we want to expand the many different ways you can integrate data into and out of Pachyderm:
This one is already underway and you can follow along here: https://github.com/pachyderm/pachyderm/pull/3432
Performance is paramount when it comes to applied data science, full stop. In 2018, we made drastic performance improvements to Pachyderm, and 2019 will be no different. Users will continue to see performance improvements for nearly every type of workload throughout the year.
This one we're keeping a bit closer to the chest, as it's too early to provide details. What we will say is that it's a natural continuation of our mission to enable reproducible data science and facilitate collaboration, but on a global scale. We plan to do that by making Pachyderm more accessible and eliminating the infrastructure obstacles that stand in the way of real-world ML/AI.