Large Language Models
Will code-generating LLMs replace developers? Probably not anytime soon
May 18, 2023 by Ofer Mendelevitch
In part 1, we explored what code-generating LLMs are and how they work. We saw some examples of how developers can use code-generating LLM technology to be more productive and efficient.
As a developer myself, I couldn’t be more excited about tools like StarCoder and GitHub Copilot for code completion. What else can this technology help me with?
Let’s take a look at some advanced features of code-generating LLMs.
Copilot Labs (a VSCode extension) already gives us a hint of what’s coming next.
Can code-generating LLMs help us with code refactoring? Yes, they sure can.
I took the code example from PyCharm’s refactoring tutorial, which simplifies rational numbers.
Me: can you please refactor the following python code:
from collections import namedtuple

class Rational(namedtuple('Rational', ['num', 'denom'])):
    def __new__(cls, num, denom):
        if denom == 0:
            raise ValueError('Denominator cannot be null')
        if denom < 0:
            num, denom = -num, -denom
        x = abs(num)
        y = abs(denom)
        while x:
            x, y = y % x, x
        factor = y
        return super().__new__(cls, num // factor, denom // factor)

    def __str__(self):
        return '{}/{}'.format(self.num, self.denom)
GPT-4 not only does a great job, it also explains what it did:
Here’s the refactored code:
from collections import namedtuple
from math import gcd

class Rational(namedtuple('Rational', ['num', 'denom'])):
    def __new__(cls, num, denom):
        if denom == 0:
            raise ValueError('Denominator cannot be null')
        factor = gcd(abs(num), abs(denom))
        # Ensure the denominator is always positive
        sign = -1 if num * denom < 0 else 1
        num = sign * abs(num) // factor
        denom = abs(denom) // factor
        return super().__new__(cls, num, denom)

    def __str__(self):
        return '{}/{}'.format(self.num, self.denom)
In this refactored code, I have:
1. Imported gcd function from the math module to simplify the calculation of the greatest common divisor (GCD) between the numerator and denominator.
2. Removed the loop for calculating the GCD and replaced it with the gcd() function.
3. Simplified the handling of the sign for the numerator and denominator by using a sign variable.
4. Updated the formatting in the __str__ method to match the rest of the code.
This is really impressive: a great refactoring that also teaches you a thing or two about current best practices in code refactoring.
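To sanity-check the result, here’s a minimal usage sketch of the refactored class (my own example, not part of GPT-4’s answer, assuming the class definition above is in scope):

# Quick sanity check of the refactored Rational class
print(Rational(2, 4))   # prints 1/2  -- reduced via gcd
print(Rational(3, -9))  # prints -1/3 -- sign moved to the numerator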
GPT-4 can also help with refactoring for speed. Here’s an example:
Me: Please refactor this function to run faster:
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
GPT-4:
The given function uses a naive recursive approach which has exponential time complexity. We can improve the performance by using memoization to store intermediate results and avoid redundant calculations. Here’s a refactored version using memoization:
def fibonacci(n, memo=None):
    if memo is None:
        memo = {}
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n in memo:
        return memo[n]
    else:
        memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
        return memo[n]
This version of the function will run significantly faster, as it only computes the Fibonacci numbers for each value of n once and reuses the stored results.
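As a side note (my addition, not part of GPT-4’s answer), Python’s standard library can give you the same memoization without threading a dictionary through the recursion, via functools.lru_cache:

from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    # Same recurrence as the naive version; lru_cache stores each result,
    # so fibonacci(k) is computed only once for every k.
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(35))  # 9227465, returned almost instantly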
Can I point code-generating LLMs to my internal code and have them provide answers specific to me? Yes, of course; that’s what GitHub Copilot does: it provides responses based on the context of your code inside VSCode.
But that means my code gets sent to the model, and this has potential privacy implications. For example, would machine learning engineers developing Bard at Google trust that their code won’t leak to OpenAI if they used this? Probably not.
We’ve seen how incredibly useful code-generating LLMs can be, but this leads to the obvious question: will they get so good that they eventually replace developers altogether?
I honestly don’t think so. At least not anytime soon.
Software development is not only about working on your laptop and being a “code monkey”. It’s a lot more than that. To create quality software, you need to understand the requirements, design the right architecture, choose deployment platforms, and make various design decisions along the way.
I see code-generating LLMs as an amazingly powerful productivity tool that helps developers focus on what really matters, instead of memorizing “how do I do X in JavaScript” or “what is the method name for doing Y in Pandas”. Developers will still be responsible for design and architecture, will need to review and test the code generated by LLMs, perform broader refactorings, and much more.
Not long ago, during a Vectara technical interview, the interviewer realized that the candidate had GitHub Copilot enabled in their environment, but nevertheless allowed the candidate to continue using it.
That is a true story, and the point is this: there is much more to excellence in software engineering than what GitHub Copilot provides.
At Vectara we don’t focus on code-generating LLMs as part of our platform, but our engineering team fully enjoys the productivity benefits they provide.
You can, however, use Vectara’s powerful neural search and LLM-powered application platform in a few interesting ways.
Get started with our free plan here.