CodeGemma vs CodeLlama
Google recently released CodeGemma, an LLM trained to handle coding-related tasks, and claims it is faster and more capable than CodeLlama, Meta's equivalent code LLM (see benchmarks below). We decided not to take their word for it and ran tests of our own using Msty's convenient split chats feature.
Note: We will be using the 7B and 7B Instruct variants with similar configurations for both models. If you are interested in exploring further, CodeGemma and CodeLlama are available for download directly through Msty.
Code Completion
Let's start with one of the most basic (and probably one of the first) tasks we learn to solve as programmers: writing a function to reverse a string. Let's ask CodeGemma and CodeLlama to write it for us, this time in JavaScript, since it's a language most of us are already familiar with.
CodeGemma answered with explanations included, whereas CodeLlama provided just the answer. Also notable is how the two models approached the problem: CodeGemma wrote a manual solution without using JavaScript's built-in methods, while CodeLlama chose the built-in reverse() method.
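For reference, here's a sketch of the two approaches; these are illustrative reconstructions, not the models' verbatim output.

```javascript
// Manual approach, in the spirit of CodeGemma's answer: walk the string from
// the end and build the reversed result character by character.
function reverseString(str) {
  let reversed = "";
  for (let i = str.length - 1; i >= 0; i--) {
    reversed += str[i];
  }
  return reversed;
}

// Built-in approach, in the spirit of CodeLlama's answer: split into an array
// of characters, reverse the array, and join it back into a string.
function reverseStringBuiltIn(str) {
  return str.split("").reverse().join("");
}

console.log(reverseString("Msty"));        // "ytsM"
console.log(reverseStringBuiltIn("Msty")); // "ytsM"
```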
Code Review / Debugging
LLMs can be useful for spotting errors in code that the human eye might overlook. In this particular example, we asked the models to debug a VueJS code snippet in which we directly modify the value of a prop.
CodeGemma promptly identified the problem with our code and pointed out that we should not modify the value of a prop in a child component. CodeLlama's response, on the other hand, was a bit off-topic: it reported that the problem was with our syntax rather than with the implementation. Both syntaxes are valid, but CodeLlama completely missed the real problem with our code.
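To make the scenario concrete, here is a hypothetical reconstruction of the kind of bug we asked about; the component, prop, and method names are invented for illustration and this is not our exact prompt.

```javascript
// Hypothetical Vue 3 child component (names invented for illustration).
export default {
  name: "CounterChild",
  template: `<button @click="incrementRight">Count: {{ count }}</button>`,
  props: {
    count: { type: Number, required: true },
  },
  emits: ["update:count"],
  methods: {
    incrementWrong() {
      // Anti-pattern: a child component mutating its own prop. Vue warns about
      // this, and the change is lost whenever the parent re-renders.
      this.count++;
    },
    incrementRight() {
      // Correct approach: emit an event and let the parent, which owns the
      // value, perform the update.
      this.$emit("update:count", this.count + 1);
    },
  },
};
```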
Unit Testing
Using the reverseString() function that CodeGemma generated earlier to reverse a string in JavaScript, let's ask our models to write test cases for us.
We noticed that CodeGemma's test cases were quite comprehensive and covered various edge cases, while CodeLlama's tests only covered a few basic scenarios.
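The sketch below shows the kind of edge cases a comprehensive suite should cover; it assumes Jest and that reverseString() is exported from a local module, and it is not CodeGemma's verbatim output.

```javascript
// Assumes reverseString() is exported from ./reverseString.js and Jest is the
// test runner; illustrative of the edge cases we'd expect to see covered.
const reverseString = require("./reverseString");

describe("reverseString", () => {
  test("reverses a regular string", () => {
    expect(reverseString("hello")).toBe("olleh");
  });

  test("returns an empty string unchanged", () => {
    expect(reverseString("")).toBe("");
  });

  test("handles a single character", () => {
    expect(reverseString("a")).toBe("a");
  });

  test("preserves spaces and punctuation", () => {
    expect(reverseString("a b!")).toBe("!b a");
  });

  test("handles palindromes", () => {
    expect(reverseString("racecar")).toBe("racecar");
  });
});
```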
Text-to-SQL Generation
Code LLMs excel at generating complex database queries. Some models like DuckDB NSQL and SQL Coder are specifically trained for this purpose. In the following example, we gave CodeGemma and CodeLlama a MySQL schema that tracks the attendance of students in classrooms and asked them both to write a query to get the total attendance of a particular classroom on a particular date.
CodeLlama took a shortcut to the solution and directly queried the attendance table where classroom_id was 'MEB-1'. This would not work as expected because the classroom_id column is of INT type. The correct solution is the one proposed by CodeGemma: join the attendance table with the classrooms table and then filter on the classroom's name.
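As a rough sketch of the join-based approach (the table names, column names, and date below are assumed for illustration rather than copied from our actual schema and prompt):

```sql
-- Count attendance rows for a classroom identified by name, not by INT id.
SELECT COUNT(*) AS total_attendance
FROM attendance a
JOIN classrooms c ON c.id = a.classroom_id
WHERE c.name = 'MEB-1'
  AND a.attendance_date = '2024-04-15';
```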
Coding Interview Questions
The Instruct variants of LLMs are trained to respond in natural language, which allows them to converse in a more human-friendly manner. This means we can leverage the models for tasks like interview preparation and quizzes. We can also set system instructions on the models to further shape their conversational style.
For this exercise, we used the 7B Instruct variants of the models and asked them a question about space complexity for a function from the book Cracking the Coding Interview. The function in question sums adjacent elements between 0 and n.
The correct answer for the space complexity of this function is O(1), but CodeLlama insisted that it is O(n). While both models mentioned that the for loop iterates from 0 to n-1, only CodeGemma noted that the loop does not create any new objects or data structures in memory, resulting in a space complexity of O(1).
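For context, here is a JavaScript rendering of the kind of function in question, roughly along the lines of the book's pairSum example; the book's original is written in Java, and the names here are approximate.

```javascript
// Approximate JavaScript version of the function under discussion: it sums
// adjacent pairs from 0 to n.
function pairSum(a, b) {
  return a + b;
}

function pairSumSequence(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    // Each pairSum call returns before the next one starts, so only a constant
    // amount of extra memory is in use at any moment: space complexity is O(1),
    // even though the loop runs O(n) times.
    sum += pairSum(i, i + 1);
  }
  return sum;
}

console.log(pairSumSequence(5)); // 1 + 3 + 5 + 7 + 9 = 25
```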
Overall, Google's CodeGemma 7B proved far superior to CodeLlama 7B at handling diverse coding problems. It was also better at writing unit tests and at analyzing code issues and complexities.
During our tests, CodeLlama answered most queries incorrectly, and even its correct answers were often limited or missed many edge cases.
CodeLlama also seemed to struggle with providing language-aware markdown formatting for generated code snippets.
That's it for CodeGemma vs CodeLlama! You can explore more LLMs and compare their responses side by side by downloading Msty and using our split chats feature.