PyCon US 2024 Welcomes Seven Companies to Startup Row

It’s another year, and that means it’s time for another PyCon US. This event offers an opportunity to bring together a vibrant and diverse community of engineers, data scientists, researchers, students, and enthusiasts who use the Python programming language for work and for fun.

It also presents a unique opportunity to highlight entrepreneurs building the future of the Python ecosystem. 2024 marks the thirteenth year that PyCon US has invited early-stage companies to present their ventures on Startup Row. Tools and platforms like Pandas, Plotly, and more—now fixtures of the Python software ecosystem—were built by companies that were featured on Startup Row. And with each new year there’s an opportunity to catch a glimpse of what’s next.

PyCon US organizers and the Python Software Foundation are thrilled to welcome seven companies to Startup Row at PyCon US 2024.

The Startup Row Lineup at PyCon US 2024

DAGWorks

Alexander Hamilton and Aaron Burr might’ve been historic rivals, but the two open-source Python libraries named after them, Hamilton and Burr, work together quite nicely. The company behind both libraries, DAGWorks, was founded with the goal of standardizing the way developers build data, LLM, and RAG pipelines, and the way they manage and observe application state; both libraries also come with optional self-hosted UIs. Hamilton is the older of the two projects and is used in production by many organizations, from banks and startups to government departments and enterprise conglomerates. DAGWorks is based in the San Francisco Bay Area and was founded by Hamilton and Burr co-creators Stefan Krawczyk and Elijah ben Izzy, who previously worked together at Stitch Fix, where they built the styling service’s MLOps stack. DAGWorks participated in Y Combinator’s Winter 2023 accelerator batch.

Dispatch

Building and maintaining distributed systems is a complex and often frustrating exercise. A given function might go rogue, either firing too frequently or silently failing to execute when needed. Dispatch is a startup aiming to make building reliable distributed systems a breeze. Its platform offers developers the ability to make any async function resumable by simply wrapping it with a @dispatch.function decorator, enabling durable workflows and robust error handling. As one customer put it, “I love the API design, just tuck in a decorator and you’re done, almost feels like cheating!” The company was founded by engineers with experience building high-scale distributed systems at companies like Meta, Segment, and Twilio, among others.
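To give a rough feel for the "just tuck in a decorator" idea, here is a minimal sketch of a retry-on-failure decorator for async functions, written with only the Python standard library. It is not the Dispatch SDK and does not provide durable, cross-process resumption; the `retrying` decorator, the `send_welcome_email` function, and the simulated failure are all hypothetical names made up for illustration.

```python
import asyncio
import functools


def retrying(max_attempts=3, delay=0.0):
    """Hypothetical stand-in: re-run an async function until it succeeds.

    This only retries in-process. A durable execution platform would also
    persist state so work can resume after a crash or restart.
    """
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return await func(*args, **kwargs)
                except Exception as exc:  # a real system would filter error types
                    last_error = exc
                    await asyncio.sleep(delay)
            raise last_error
        return wrapper
    return decorator


# Counts invocations so the simulated outage can end after two failures.
calls = {"count": 0}


@retrying(max_attempts=3)
async def send_welcome_email(user_id):
    # Fails on the first two calls to simulate a flaky downstream service.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return f"sent to {user_id}"


result = asyncio.run(send_welcome_email(42))
print(result, "after", calls["count"], "attempts")
```

The caller's code stays the same whether or not the function is wrapped, which is the appeal of the decorator approach: reliability logic lives in one place instead of being scattered through business code.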

dltHub

Dealing with data can be a chore for developers. One has to get the data, clean it up, and load it somewhere for it to be useful. At least for that last step, there’s dlt (a.k.a. data load tool), an open source Python library that takes the hassle out of shepherding data from one place to another. The library features out-of-the-box integrations for dozens of verified data sources and destinations, making it easy to fetch data from places like Facebook Ads, Pandas dataframes, or even Chess.com and then load data into destinations ranging from Databricks and Snowflake to Postgres and DuckDB. dltHub, the company behind the popular dlt library, is based in Berlin and New York City. “Today it is easier to pip install dlt and write a custom source in Python than to setup and configure a traditional ETL platform,” said CEO and co-founder Matthaus Krzykowski. The startup is backed by Dig Ventures and technical founders from companies like Hugging Face, Miro, Matillion, and others.

Exaloop

What gets called “big” data is constantly shifting. Not too long ago, anything over a handful of gigabytes was pretty big. Now, working datasets can be several orders of magnitude larger, which presents a challenge for data scientists and developers who just want to write Python. Enter Exaloop. The MIT spinout company’s platform enables data scientists and engineers to write and run Python workflows that execute extremely fast (10x-100x faster than vanilla Python), even when dealing with truly massive data. The secret sauce: Codon, a high-performance Python compiler that uses LLVM. Codon has garnered a fair bit of attention from the developer community, racking up nearly 14,000 stars on GitHub. The company is planning to release a new, entirely cloud-based platform, enabling users to leverage Exaloop’s tech and performance benefits conveniently within a browser window.

PandasAI

Data analysis packages like Pandas are, to most users, pretty intuitive. Someone with a bit of Python knowledge can get the hang of these tools pretty quickly. But not everyone knows Python, and many others find the prospect of writing code intimidating. For those who prefer to ask questions rather than write complex queries, there’s PandasAI. PandasAI allows users to connect to a database or import a CSV, making data interaction conversational. “We want to make it easy for anyone, in any company, to be able to derive valuable insights from data. That shouldn’t be limited to a small team of engineers and analysts,” said PandasAI founder and CEO Gabriele Venturi. Users can write queries in natural language (e.g. “In my database of ecommerce transactions, which customer ID appears most often?”) and even ask PandasAI to plot the data they’re analyzing. The library is compatible with several LLM providers and model variants, reducing the risk of model lock-in. PandasAI is open source and has over 11,000 stars on GitHub. The company participated in Y Combinator’s Winter 2024 batch.
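For comparison, here is roughly what answering that example question looks like when written by hand. This is a small sketch in plain standard-library Python, not PandasAI's API, and the transaction records are made up purely for illustration.

```python
from collections import Counter

# Made-up ecommerce transactions, for illustration only.
transactions = [
    {"customer_id": "C001", "amount": 25.0},
    {"customer_id": "C002", "amount": 40.0},
    {"customer_id": "C001", "amount": 12.5},
    {"customer_id": "C003", "amount": 8.0},
    {"customer_id": "C001", "amount": 99.9},
]

# Tally how many transactions each customer ID appears in.
counts = Counter(row["customer_id"] for row in transactions)

# most_common(1) returns a one-element list of (value, count) pairs.
top_customer, top_count = counts.most_common(1)[0]
print(top_customer, top_count)  # C001 3
```

It is only a few lines, but writing them requires knowing both Python and the shape of the data, which is the barrier a conversational interface aims to remove.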

Pixee

When it comes to application security, many developers wish for the equivalent of magical fairy dust they can sprinkle on their codebase to fix bugs and vulnerabilities. Pixee is about as close as developers can get to that magical fairy dust. The D.C. metro area-based security startup is the maker of Pixeebot, a virtual security engineer that automatically triages and fixes potential security issues before they can be exploited in production. The bot is built on Codemodder, an open source framework, created and maintained by Pixee, that can find and fix issues in Python and Java. Pixee intends to extend language support to JavaScript and Node.js in the near future.

Trellis

There’s nothing better than clean, well-structured data. It’s easy to query, easy to work with, and adds value to any enterprise or personal use case. The problem is that data rarely comes so neatly packaged. Trellis helps developers ingest and extract insights from unstructured data sources like sales calls, regulatory filings, business contracts, large-scale web scrapes, and more. Trellis users can define their data schema in natural language, which is transformed into a structured SQL format that can integrate with hundreds of data sources and LLM-augmented workflows. The company is based in San Francisco and participated in Y Combinator’s Winter 2024 accelerator batch.

A Quick Look Back at 2023’s Startup Row Alumni

Startup founders make no small plans. Startups are inherently risky and not all are destined for greatness, but the 2023 batch of Startup Row companies seems to be on the path to success.

  • 🦄 Imbue, a generative AI company developing agentic models that can reason and code, raised $200M+ in Series B funding at a $1B+ valuation. Investors in that deal include NVIDIA, Astera Institute, Notion co-founder Simon Last, and former Cruise CEO Kyle Vogt, among others. The company was formerly known as Generally Intelligent.
  • Neptyne continued expanding its Python-included spreadsheet tools and launched a Google Sheets extension that brings real, live Python to Google’s spreadsheet software. Neptyne also launched API integrations, giving its users credits to call OpenAI, Bing web search, web page rendering through PhantomJsCloud, Google’s geocoder, and financial information within their spreadsheet environments.
  • Nixtla released TimeGPT (arXiv). Considering the company’s whole raison d'être is building time series data modeling and projection tools, this seems like a great use of transformer models’ next-token prediction capabilities. Nixtla’s open source tools continue to gain popularity.
  • ❄️ Ponder Data, maker and maintainer of Modin, was acquired by Snowflake to expand the Python capabilities of its data cloud platform. Modin is used by hundreds of thousands of users and serves as a drop-in replacement for the ever-popular Pandas library.
  • Predibase announced, a month or so after PyCon US 2023, that it had raised another $12.2 million from Felicis Ventures, bringing its total Series A funding to $28M. In November the company released its LoRA Exchange (LoRAX) framework to the open source community under an Apache 2.0 license. According to Predibase’s announcement, “[the framework] makes it possible to serve hundreds of fine-tuned LLMs at the cost of one GPU with minimal degradation in throughput and latency.” In February the company launched LoRA Land, an LLM playground featuring 25 fine-tuned open source LLMs that “rival or outperform GPT-4” on specific tasks.
  • Reflex is the new name of the company formerly known as Pynecone, and its open-source web application framework is garnering more attention than ever. Back in August the YC alumni company announced that it raised $5M in seed funding led by Lux Capital to continue building out its platform and to develop and launch a revenue-generating hosting service for Reflex apps. The company’s open source framework has over 17,000 stars on GitHub.
  • Wherobots, a geospatial data platform founded by the creators of Apache Sedona, announced $5.5 million in seed funding to build out its enterprise platform. Apache Sedona has been downloaded over 20 million times and is among the top one percent of most-downloaded packages on PyPI.

Needless to say, we’re pleased to be at least a small part of these companies’ ongoing success.

Thank-Yous & Acknowledgements

Startup Row is made possible by the trust, logistical support, and tireless efforts of countless folks. A heartfelt thanks goes out to the organizers and volunteers who work diligently to ensure PyCon US is a positive and engaging event, both in-person and virtually.

We also extend our gratitude to the many entrepreneurs who, despite their undoubtedly busy schedules, applied to Startup Row. Your time and attention are valuable, and we appreciate the thought you put into your applications.

A special thanks goes to the selection committee. Scoring such a competitive applicant pool is a significant commitment, and their time and expertise are greatly appreciated.

Lastly, our deepest thanks to the seven companies that traveled to Pittsburgh to share their innovations with over 2,700 PyCon US attendees. Thank you all once again.
