This startup uses a team of AI agents to write and review their pull requests

June 07, 2024

Podcast Summary

Future of junior engineering roles: Advancements in AI and machine learning may reduce the need for traditional junior engineering roles, while the labor market's shift toward profitability over growth presents challenges for some tech workers. Startups like Squire AI are innovating to help developers add new meaning to their codebase.
The future of software development may involve less reliance on traditional junior engineering roles due to advancements in AI and machine learning. Meanwhile, the labor market is shifting toward profitability over growth, making it challenging for some tech workers to find new jobs. Samuel Patel, CEO and cofounder of Squire AI, shared how his background in computer science and gaming led him to a career in software development and eventually to founding his own startup. Squire AI initially focused on just-in-time documentation for developers but had to pivot when large language models (LLMs) emerged. Patel and his team are now building tools to help developers give new meaning to their codebase, making Squire AI an interesting addition to a competitive landscape that includes Stack Overflow for Teams. The conversation also touched on the evolution of Squire AI and the challenges the team faced in the ever-changing software development landscape.
Squire AI agents evolution: Squire AI evolved from just-in-time documentation to LLM-based code review assistance, and aims to integrate throughout the SDLC to assist developers.
Squire AI is a suite of agents designed to automate smaller tasks within the software development life cycle. It began with just-in-time documentation, but the emergence of Large Language Models (LLMs) led the team to pivot towards creating agents that could help developers understand code ownership and responsibilities. The latest iteration of Squire AI aims to provide constructive feedback during code reviews by traversing the codebase and searching for symbols, meaning, and context to ensure code quality and adherence to best practices. The future of agents, according to Squire AI, is in their atomicity and their ability to work together in a multi-agent system to tackle increasingly complex tasks. These agents employ techniques such as reflection, tool use, planning, and collaboration to provide valuable feedback and to make use of one another. Today, Squire AI focuses on code reviews, but the ultimate goal is to integrate these agents throughout the entire software development life cycle to assist developers at every stage.
Master models with fine-grained control: Future AI development will create master models capable of leveraging multiple models for specific tasks, offering opinions and suggestions, and collaborating with humans effectively.
The future of AI development is heading towards the creation of master models with fine-grained control over individual agents' knowledge and the ability to leverage multiple models for specific tasks. These models will not only be able to reason and make decisions but also offer opinions and suggestions, acting more like a senior employee. The HuggingGPT paper is an example of this direction, where the model can find and use other models to complete tasks. Techniques such as reflection let agents provide criticism and suggestions, and approaches like tree of thoughts can help determine the best possible path to improve outcomes. The consensus is that we're moving towards a future where AI will be able to reason, make decisions, offer opinions, and collaborate with humans in a more effective and efficient way.
Agentic workflows for LLMs: Agentic workflows allow LLMs to focus on specific tasks, eliminating confusion and leading to efficient and accurate usage through a 'for loop' system where LLMs can use other agents as tools and maintain control over outcomes.
The future of Large Language Models (LLMs) lies in their ability to exhibit divergent thought and generate specialized outputs. This approach, known as agentic workflows, involves training models to focus on specific tasks and eliminating confusion. The use of smaller, specialized models, like Code Llama, can lead to similar or better outcomes than using large, heavy models for every task. However, there's a risk of models becoming overly specialized and losing proficiency in other areas. Squire AI uses a variety of technologies, such as Python, TypeScript, graph databases, and embeddings, to build these systems. Agentic workflows involve putting LLMs in a "for loop," allowing them to think, act, and reconsider their actions. Squire AI's system enables agents to use other agents as tools, ensuring syntactical accuracy and maintaining control over the outcomes. This approach can lead to more efficient and accurate LLM usage.
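The "for loop" idea and the notion of agents using other agents as tools can be illustrated with a minimal Python sketch. This is not Squire AI's implementation: the `Agent` class, the `decide` callables, and both policies are hypothetical stand-ins, and the LLM call is replaced by plain functions so the example runs on its own. The only concrete check is Python's built-in `ast.parse`, used here as a simple syntax tool.

```python
import ast
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
Note = Tuple[str, str, str]  # (action taken, argument, result)


@dataclass
class Agent:
    """Hypothetical agent: a decision function run inside a bounded loop."""
    name: str
    # Stand-in for an LLM call: returns (action, argument) given task and notes.
    decide: Callable[[str, List[Note]], Tuple[str, str]]
    tools: Dict[str, Tool] = field(default_factory=dict)
    max_steps: int = 5

    def run(self, task: str) -> str:
        notes: List[Note] = []
        for _ in range(self.max_steps):            # the "for loop"
            action, arg = self.decide(task, notes)  # think
            if action == "finish":
                return arg
            result = self.tools[action](arg)        # act
            notes.append((action, arg, result))     # reconsider on the next pass
        return "gave up"

    def as_tool(self) -> Tool:
        """Expose this agent so another agent can call it like any other tool."""
        return self.run


def check_syntax(code: str) -> str:
    """Report whether the given Python source parses."""
    try:
        ast.parse(code)
        return "ok"
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"


def syntax_policy(task: str, notes: List[Note]) -> Tuple[str, str]:
    # First pass: run the checker. Second pass: report its result.
    if not notes:
        return ("check_syntax", task)
    return ("finish", notes[-1][2])


def reviewer_policy(task: str, notes: List[Note]) -> Tuple[str, str]:
    # Delegate to the syntax agent, then decide based on what it said.
    if not notes:
        return ("syntax_agent", task)
    verdict = notes[-1][2]
    return ("finish", "approve" if verdict == "ok" else f"request changes: {verdict}")


syntax_agent = Agent("syntax", syntax_policy, {"check_syntax": check_syntax})
reviewer = Agent("reviewer", reviewer_policy, {"syntax_agent": syntax_agent.as_tool()})

print(reviewer.run("x = 1"))
print(reviewer.run("x = = 1"))
```

The parent agent never inspects the code itself; it only sees the child agent's answer, which is one way a parent can keep control over the outcome while delegating a narrow task.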
OpenAI agent hierarchy and business model: OpenAI is developing a hierarchy of agents that work together to achieve specific outcomes, with a focus on per-seat pricing, predictable costs, and specific models for balancing cost and value. Innovations in data centers, energy, and compute resources are needed to support the future agentic workforce, with OpenAI exploring the possibility of selling excess compute power as heat.
OpenAI is developing a hierarchy of agents that work together to achieve specific outcomes, with the parent agent being the one users interact with most. This agent interfaces with various tools and other agents to understand code structures and provide the desired outcome. OpenAI's focus on per-seat pricing aims to provide predictable costs and control expenses, as usage-based pricing can be unpredictable. They are also developing more specific models to balance cost and value, with smaller models used for specific tasks. The increasing demand for AI agents will require innovations in data centers, energy, and compute resources to support the future agentic workforce. OpenAI is also exploring the possibility of selling excess compute power as heat. The cost of inference remains high, but OpenAI is working on new techniques for efficiency and has recently made some new offerings free to the public. The business model revolves around providing value to businesses while managing costs. The development of these agents and the increasing demand for AI technology will necessitate innovations in various areas to support the future workforce.
Energy efficiency in AI: Addressing energy constraints is crucial for maximizing efficiency and value from AI and data centers. Renewable energy solutions and specialized models can help reduce energy consumption.
There's an opportunity to maximize efficiency and extract more value from AI and data centers by addressing the energy issue. The discussion highlighted the potential bottleneck of building new data centers due to energy constraints, as well as the need for more advanced grids to effectively transfer renewable energy. The future of AI lies in people owning their own AI and having energy-efficient computers in their homes. Additionally, being selective about the models used based on the task at hand can help save costs and reduce energy consumption. While there will still be a place for large, general models, specialized models will likely take over as tasks become more specific. Overall, it's important to consider energy efficiency and the potential for renewable energy solutions to support the growth of AI technology.
Specialized models vs sharing knowledge: Specialized models are important for energy and cost efficiency in completing specific tasks, while sharing knowledge within the tech community can benefit thousands through platforms like Stack Overflow.
As technology advances, we can expect to see increasingly specialized models being used for specific tasks due to energy and cost efficiency. For instance, there are models designed specifically for generating new ideas for CRISPR proteins, which an average language model might not be able to do. Meanwhile, in the world of programming, a great example of shared knowledge comes from Bharath Haba, who asked a question on Stack Overflow about disabling source maps for React JS applications. This question helped over a thousand people and received a great answer with 40 upvotes. These examples highlight the importance of both specialized models and the sharing of knowledge within the tech community. If you're interested in contributing to this community, you can join the conversation on Stack Overflow or listen to the podcast for engaging discussions on various tech topics. And remember, leaving a rating and review is the nicest thing you can do besides sending money and free swag.

Recent Episodes from The Stack Overflow Podcast

The problem with the tech debt mindset

Chelsea Troy defines technical debt and maintenance load in her blog post, “Stop saying ‘technical debt.’”

Learn more about technical bankruptcy in this blog post, “Monitoring debt builds up faster than software teams can pay it off.”

Joel Spolsky’s classic blog post on avoiding rewriting code from scratch – Things you should never do, part I.

Technical debt as explained by Ward Cunningham, who coined the term.

Code as an asset, a conversation from Hacker News.

Middleware is the “software glue” that provides services to applications beyond those available from the operating system.

The Ratpack framework is a toolkit for creating high-performance web applications.

React is a front-end JavaScript library.

jQuery is a JavaScript library designed to simplify HTML DOM traversal and manipulation.

Questions about functional programming.

User shout-out! Nikoksr received the Lifeboat badge after answering a question related to math.pow.

The Stack Overflow Podcast

July 23, 2024

Java, but why? The state of Java in 2024

You can connect with Lenny Primak at Flow Logix, X, LinkedIn, GitHub, or Mastodon.

Got questions about Java? Check out the site.

Apache Groovy is a programming language for the Java platform.

Virtual Threads reduce the effort of writing, maintaining, and observing high-throughput concurrent applications.

Apache Shiro is an open-source security framework that can do authentication, authorization, cryptography, and session management.

Jakarta EE, or Jakarta Enterprise Edition, is a suite of services that helps developers write enterprise applications for the Java platform.

The Stack Overflow Podcast

July 19, 2024

java

programming languages

The framework helping devs build LLM apps

LlamaIndex is a data framework for building LLM applications. Check out the open-source framework or get started with the developer community, LlamaHub.

Looking for a deeper understanding of RAG? Start with our guide.

Wondering how to import `SimpleDirectoryReader` from LlamaIndex? This question has you covered.

Jerry Chen is a partner at Greylock. Connect with him on LinkedIn.

Read Jerry Liu’s posts on the LlamaIndex blog or connect with him on LinkedIn.

The Stack Overflow Podcast

July 16, 2024

llm

gen ai

generative ai

retrieval augmented generation

prompt engineering

rag

llamaindex

Why we built Staging Ground

Learn more about Staging Ground on our blog or in the help center.

Find Kyle on LinkedIn, GitHub, and Twitter.

Spevacus is a full stack developer and Stack Overflow moderator. They’re a participant in Charcoal, a user-run group that fights spam and rude/abusive content across the Stack Exchange network.

The Stack Overflow Podcast

July 12, 2024

We chat search from both sides now

Stack Overflow and Elastic are collaborating to improve the search experience using vector search and generative AI. Learn more about the new AI features for Stack Overflow for Teams, including Enhanced Search.

Learn more about the Elastic platform, including vector search. Developers can start building here.

Connect with Paul, Steffi, and Gregor on LinkedIn.

Stack Overflow user chepner won a Lifeboat badge for answering How do I use __repr__ with multiple arguments?.

The Stack Overflow Podcast

July 09, 2024

What can devs do about code review anxiety?

Carol is an applied clinical and intervention scientist: she develops and tests cognitive, behavioral, and social interventions that activate key mechanisms to elicit change. Learn more about understanding and mitigating code review anxiety (the full version of her article is here).

You can also check out the code review anxiety workbook.

Pluralsight’s Developer Success Lab is a team of scientists studying how developers work, learn, and innovate.

Explore more of Carol’s work on code review anxiety, her bio, or her other work, from developer productivity and stress management to coding with GenAI.

Connect with Carol on LinkedIn or Mastodon.

The Stack Overflow Podcast

July 05, 2024

developer success lab

Happy people make better products

Still thinking about developer happiness and productivity? Read Eira’s article about the real 10x developers among us.

Connect with Ben Borra through his website or LinkedIn.

Asked and answered: Stack Overflow user Jian earned a Great Question badge with How do I close a frozen SSH session?.

The Stack Overflow Podcast

July 02, 2024

developer experience

feedback

developer productivity

developer happiness

coding assistants

How to build open source apps in a highly regulated industry

Before Medplum, Reshma founded and exited two startups in the healthcare space – MedXT (managing medical images online, acquired by Box) and Droplet (an at-home diagnostics company acquired by Ro). Reshma has a B.S. in computer science and a Master of Engineering from MIT.

You can learn more about Medplum here and check out their GitHub, which has over 1,200 stars, here.

You can learn more about Khilnani on her website, GitHub, and on LinkedIn.

Congrats to Stack Overflow user Kvam for earning a Lifeboat Badge with an answer to the question:

What is the advantage of using a Bitarray when you can store your bool values in a bool[]?

The Stack Overflow Podcast

June 28, 2024

A very special 5-year-anniversary edition of the Stack Overflow podcast!

Cassidy reflects on her time as CTO of a startup and how the shifting environment for funding has created new pressures and incentives for founders, developers, and venture capitalists.

Ben tries to get a bead on a new Moore’s law for the GenAI era: when will we start to see diminishing returns and fewer step factor jumps?

Ben and Cassidy remember the time they made a viral joke of a keyboard!

Ryan sees how things go in cycles. A Stack Overflow job board is back! And what do we make of the trend of AI-assisted job interviews, where cover letters and even technical interviews have a bot in the background helping out?

Congrats to Erwin Brandstetter for winning a Lifeboat badge with an answer to this question: How do I convert a simple select query like select * from customers into a stored procedure / function in pg?

The Stack Overflow Podcast

June 25, 2024

Say goodbye to "junior" engineering roles

How would all this work in practice? Of course, any metric you set out can easily become a target that developers look to game. With Snapshot Reviews, the goal is to get a high-level overview of a software team’s total activity and then use AI to measure the complexity of the tasks and output.

If a pull request attached to a Jira ticket is evaluated as simple by the system, for example, and a programmer takes weeks to finish it, then their productivity would be scored poorly. If a coder pushes code changes only once or twice a week, but the system rates them as complex and useful, then a high score would be awarded.
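To make the two cases above concrete, here is a rough Python sketch of a complexity-weighted score. It is purely illustrative and not how Snapshot Reviews computes anything: the `Task` fields, the 1-to-10 complexity scale, and the "complexity delivered per day" formula are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    complexity: float  # hypothetical AI-assigned rating: 1 (simple) to 10 (complex)
    days_taken: float  # elapsed days from start to merge


def productivity_score(tasks: List[Task]) -> float:
    """Assumed metric: total complexity delivered per day of elapsed work."""
    total_days = sum(t.days_taken for t in tasks)
    if total_days == 0:
        return 0.0
    return sum(t.complexity for t in tasks) / total_days


# A simple ticket that took two weeks scores poorly.
slow_simple = [Task(complexity=1, days_taken=14)]
# Infrequent pushes that are rated complex score well.
rare_complex = [Task(complexity=8, days_taken=4), Task(complexity=9, days_taken=3)]

print(productivity_score(slow_simple) < productivity_score(rare_complex))
```

Under these assumptions the score rewards complexity delivered rather than the number of pushes, which matches the intent described above, though any such formula can still be gamed.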

You can learn more about Snapshot Reviews here.

You can learn more about Flatiron Software here.

Connect with Kirim on LinkedIn here.

Congrats to Stack Overflow user Cherry, who earned a Great Question badge for asking: Is it safe to use ALGORITHM=INPLACE for MySQL?

The Stack Overflow Podcast

June 21, 2024

developer productivity