Large Language Models
Will code-generating LLMs replace developers? Probably not anytime soon
May 18, 2023 by Ofer Mendelevitch
In part 1, we explored what code-generating LLMs are and how they work. We saw some examples of how developers can use code-generating LLM technology to be more productive and efficient.
As a developer myself, I couldn’t be more excited about tools like StarCoder and GitHub Copilot for code completion. But what else can they help me with?
Let’s take a look at some advanced features of code-generating LLMs.
If we look at Copilot Labs (a VSCode extension), it shows us a hint of what’s coming next:
Can code-generating LLMs help us with code refactoring? Yes, they sure can.
I took the code example from PyCharm’s refactoring tutorial, which simplifies rational numbers.
Me: can you please refactor the following Python code:
from collections import namedtuple

class Rational(namedtuple('Rational', ['num', 'denom'])):
    def __new__(cls, num, denom):
        if denom == 0:
            raise ValueError('Denominator cannot be null')
        if denom < 0:
            num, denom = -num, -denom
        x = abs(num)
        y = abs(denom)
        while x:
            x, y = y % x, x
        factor = y
        return super().__new__(cls, num // factor, denom // factor)

    def __str__(self):
        return '{}/{}'.format(self.num, self.denom)
Not only does GPT-4 do a great job, it also explains what it did:
Here’s the refactored code:
from collections import namedtuple
from math import gcd

class Rational(namedtuple('Rational', ['num', 'denom'])):
    def __new__(cls, num, denom):
        if denom == 0:
            raise ValueError('Denominator cannot be null')
        factor = gcd(abs(num), abs(denom))
        # Ensure the denominator is always positive
        sign = -1 if num * denom < 0 else 1
        num = sign * abs(num) // factor
        denom = abs(denom) // factor
        return super().__new__(cls, num, denom)

    def __str__(self):
        return '{}/{}'.format(self.num, self.denom)
In this refactored code, I have:
This is really awesome: a great refactoring that also teaches you a thing or two about current best practices in code refactoring.
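To convince myself that GPT-4’s refactored class behaves correctly, I spot-checked it against the standard library’s fractions.Fraction, which normalizes rationals the same way (this check is my own addition, not part of GPT-4’s output):

```python
from collections import namedtuple
from fractions import Fraction
from math import gcd

class Rational(namedtuple('Rational', ['num', 'denom'])):
    # GPT-4's refactored version from above
    def __new__(cls, num, denom):
        if denom == 0:
            raise ValueError('Denominator cannot be null')
        factor = gcd(abs(num), abs(denom))
        # Ensure the denominator is always positive
        sign = -1 if num * denom < 0 else 1
        return super().__new__(cls, sign * abs(num) // factor, abs(denom) // factor)

    def __str__(self):
        return '{}/{}'.format(self.num, self.denom)

# Fraction reduces and normalizes signs too, so the two should always agree
for num, denom in [(4, 8), (3, -6), (-10, 4)]:
    f = Fraction(num, denom)
    r = Rational(num, denom)
    assert (r.num, r.denom) == (f.numerator, f.denominator)

print(Rational(3, -6))  # the sign moves to the numerator: -1/2
```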
GPT-4 can also help with refactoring for speed. Here’s an example:
Me: Please refactor this function to run faster:
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
GPT-4:
The given function uses a naive recursive approach which has exponential time complexity. We can improve the performance by using memoization to store intermediate results and avoid redundant calculations. Here’s a refactored version using memoization:
def fibonacci(n, memo=None):
    if memo is None:
        memo = {}
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n in memo:
        return memo[n]
    else:
        memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
        return memo[n]
This version of the function will run significantly faster, as it only computes the Fibonacci number for each value of n once and reuses the stored results.
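As an aside (my own addition, not part of GPT-4’s answer), Python’s standard library can apply the same memoization trick automatically via functools.lru_cache, without threading a memo dictionary through the recursion:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    # Same recursion as the naive version, but lru_cache stores each
    # result, so every fibonacci(k) is computed only once
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(50))  # 12586269025, returned near-instantly;
                      # the naive version would take hours for n=50
```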
Can I point code-generating LLMs to my internal code and have them provide answers specific to me? Yes, of course. That’s what GitHub Copilot does – it provides responses based on the context of your code inside VSCode.
But that means my code gets sent to the model and this has potential privacy implications. For example – would machine learning engineers developing Bard at Google trust that their code doesn’t leak to OpenAI if they use this? Probably not.
We’ve seen how incredibly useful code-generating LLMs can be, but this leads to the obvious question: will they get so good that they eventually replace developers altogether?
I honestly don’t think so. At least not anytime soon.
Software development is not only about working on your laptop and being a “code monkey”. It’s a lot more than that. To create quality software, you need to understand the requirements, design the right architecture, choose deployment platforms, and make various design decisions along the way.
I see code-generating LLMs as an amazingly powerful productivity tool that helps developers focus on what really matters, instead of memorizing “how do I do X in JavaScript” or “what is the method name for doing Y in Pandas”. Developers will still be responsible for design and architecture, will need to review and test the code generated by LLMs, do broader refactorings, and much more.
Not long ago, during a Vectara technical interview, the interviewer realized that the candidate had GitHub Copilot enabled in their environment, but nevertheless allowed the candidate to continue using it.
That is a true story – the point is: there is much more to excellence in software engineering than what GitHub Copilot provides.
At Vectara we don’t focus on code-generating LLMs as part of our platform, but our engineering team fully enjoys the benefits they provide to improve our productivity.
You can, however, use Vectara’s powerful neural search and LLM-powered application platform in a few interesting ways:
Get started with our free plan here.