Assessing the Influence of Language Models on the Productivity of Seasoned Software Developers
A recent study conducted by METR, a company that evaluates the risks and benefits of AI systems, has cast doubt on the usefulness of LLM-based coding tools, such as Cursor Pro with Claude 3.5/3.7 Sonnet, as coding partners for experienced developers. The study, which involved 16 experienced open source software developers and 246 realistic coding tasks, aimed to objectively measure the effect of LLM-based tools on software development productivity and to establish a methodology for assessing their impact.
The study placed a significant emphasis on creating realistic scenarios, rather than using canned benchmarks, with tasks involving adding features to code, fixing bugs, and refactoring, similar to tasks in open source projects. Despite METR's suggestion that performance may improve over time, the current results raise questions about the tools' usefulness.
The key finding was that productivity decreased by about 19% when developers used these LLM-based tools compared to working without assistance. This ran contrary to the developers' own expectations: they had estimated a 20-24% speedup, yet the actual measurements showed a slowdown. The slowdown was attributed to factors such as over-optimism about LLM capabilities, interference with developers' existing knowledge, poor LLM performance on large codebases, unreliability of generated code, and the inability of LLMs to make effective use of tacit knowledge and context.
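The gap between the forecast speedup and the measured slowdown comes down to comparing task completion times with and without AI assistance. A minimal sketch of that arithmetic, using entirely invented numbers (the study's per-task times are not given here), might look like this:

```python
# Hypothetical illustration of how a slowdown figure like the study's ~19%
# can be derived from paired task completion times. All numbers below are
# invented for illustration; they are not data from the METR study.

def relative_slowdown(baseline_hours: float, ai_hours: float) -> float:
    """Fractional change in completion time relative to the unaided baseline.

    Positive values mean the task took longer with AI assistance.
    """
    return (ai_hours - baseline_hours) / baseline_hours

# Invented example: a task taking 2.0h unaided and 2.38h with AI assistance
# corresponds to a 19% slowdown.
print(f"{relative_slowdown(2.0, 2.38):.0%}")
```

Note that a 19% increase in completion time is not the inverse of a 19% speedup; the sign convention matters when comparing measured slowdowns against forecast speedups.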
However, the results also hint at a steep learning curve: a minority of participants, typically those with prior experience using Cursor, did see improved performance, suggesting that using these tools effectively takes time to learn.
In summary, despite their promise as coding assistants, LLM-based tools like Cursor Pro with Claude 3.5/3.7 Sonnet currently impede rather than enhance developer productivity on real-world, complex software development tasks. The study underscores the need to reevaluate the utility of such tools until their capabilities, and their integration into developer workflows, improve significantly.
The full findings, authored by Joel Becker et al., are available as a PDF. It is worth noting that this summary does not detail the exact nature of the tasks given to the developers, nor how the no-assistance condition in the RCT was constructed.
The takeaway is that open source developers may see a productivity decrease of roughly 19% when using LLM-based tools such as Cursor Pro with Claude, and that realizing any potential benefit from integrating AI-based tools into developer workflows may require climbing a significant learning curve first.