Summary
In this episode of Into the Hopper, I sit down with Ravi Mody, a data science leader with 17 years of experience across companies like Spotify and Daily Harvest. We dig into how large language models—tools like ChatGPT, Claude, and Cursor—are changing the day-to-day reality of programming.
Ravi shares his unique perspective as someone who took a four-year break from hands-on coding while in management at Spotify, only to return right as LLMs were taking off. We discuss moving from syntax-level assistance to function-level code generation, the dangers and opportunities for junior developers, and why we’re both spending less time on Stack Overflow.
Transcript
Tim: Welcome to episode nine of the Into the Hopper podcast, back after maybe an 18-month hiatus. My name is Tim Hopper. I’m joined today by my friend Ravi Mody. Ravi has 17 years of data science and leadership experience across a variety of companies like Spotify and Daily Harvest.
Ravi and I have been discussing how large language models are impacting how we program. We’re seeing more and more tools like Cursor, and Copilot has actually been out for three years now, which is wild to me. Copilot in particular has become a valuable part of my daily workflow. I’m interested in how others are using these tools too; sometimes it’s hard to know how broad the impact is, or if it’s just people on social media talking about it. Ravi’s own development workflow has been significantly affected, and I asked him to come chat about what that looks like day to day.
Ravi: Tim, thanks so much for having me. This is definitely a topic I think about a lot.
Tim: Do you want to expand on your intro any more? I summarized your very long resume in one sentence.
Ravi: Yeah, I mean, when I talk about programming, I’m largely coming at it from the perspective of building machine learning systems. That’s my specialty—usually recommender systems specifically. So some of my use of LLM coding is probably very domain-specific.
Another piece of context: I’ve been programming since I was about 10, but I took a four-year break when I was at Spotify. I was there until February of 2024, and for those four years, I didn’t really program. I was a people manager. It was an amazing job, but when I joined Raya, my current company, I was rusty at programming. So the timing of this new generation of LLMs was really interesting.
Tim: That’s actually really helpful context, and the timing is interesting. At the most basic level, these tools act as a mental assist so you don’t have to remember all the little things about syntax, things that would have been easy to forget over four years.
Ravi: For sure. On the syntax side, I save a ton of time compared to Stack Overflow or even going through docs. I still think it’s valuable to have some muscle memory around common functions, but Cursor and these other tools make it so easy to just say what you want and get functional code out of it.
Current Toolset
Tim: Why don’t we start with an overview of your current tools? Obviously for all of us, these tool chains are changing quickly. People are jumping back and forth between Anthropic and OpenAI as their chat tools improve. What’s your current setup?
Ravi: I have access to ChatGPT, Claude 3.5, and Cursor. I go back and forth between them. The honest truth is I’m actually very comfortable with a basic text editor. I’m not a backend engineer—I’m really building ML models. So I’ve always been most comfortable in almost a notepad-style of programming. I use BBEdit.
Tim: BBEdit’s your primary editor?
Ravi: Yeah, it has been for most of my career.
Tim: I don’t think you’ve ever mentioned that. That’s interesting.
Ravi: It’s kind of embarrassing, honestly. I don’t like to tell people. I think it makes you a different kind of programmer if you’re not using a real IDE.
Tim: A lot of people have really relied on it. So BBEdit doesn’t currently have any Copilot- or Cursor-type integration in the editor; you’re going to the chat interfaces directly?
Ravi: Yeah. I’m honestly not even sure if it does. I’m not a power user of BBEdit. I literally just use it as a notepad with syntax highlighting.
Tim: That’s wild. I’m very interested to hear that. So on a day-to-day basis, what are you turning to AI tools to assist with?
Thinking at the Function Level
Ravi: One thing this means is that I’m not really getting inline code completion. Sometimes I’ll use Cursor, because otherwise there’s a little bit of friction. But I’ve never found that the speed of my typing is what really slows me down on projects. It’s usually one level of abstraction higher: taking a more complex idea and working at the level of functions and classes.
For example, if I need a function that transforms a dataset, normally I would have cleared an hour of my calendar, designed it out, whiteboarded it, and then typed it. The typing is like 10% of that time. Where I’m most comfortable is co-programming with the chat program. I say what I need, look at the code it generates, and essentially do a code review on it. I’m usually generating at the function level.
Hours become five to ten minutes now. That’s where I’ve been finding my productivity multipliers. It’s not completing a little piece of syntax; it’s thinking at the function and class level.
Tim: That’s interesting. I do a lot of that, especially for utilities—trying to transform a file from one format to another, or restructuring a CSV. Just “give me a DuckDB command that’s going to restructure this the way I want.”
Ravi: One of my favorite things now is exactly that. My least favorite programming was always changing formats or saving objects. I love watching it churn out the code I hated writing—“save this JSON file to this folder and then load it from that folder.” I’m trying to automate away the parts I just did not enjoy about the job.
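A sketch of the kind of boilerplate Ravi means here; the folder, file name, and payload are invented for illustration:

```python
import json
from pathlib import Path

# Hypothetical example: save a small object to a folder, then load it back.
out_dir = Path("artifacts")
out_dir.mkdir(exist_ok=True)

with open(out_dir / "features.json", "w") as f:
    json.dump({"user_id": 42, "embedding": [0.1, 0.2, 0.3]}, f)

with open(out_dir / "features.json") as f:
    features = json.load(f)
```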
Tim: It’s interesting historically how the ability to do that has been such a superpower. Going back to Drew Conway’s data science Venn diagram from 13 years ago—that hacking skill, being able to manipulate data. So many people just don’t have that paradigm. It’ll be interesting to see if this opens the door for more people to do that kind of thing.
I was just doing something similar for personal reasons. My county lets you download all the real estate records in one massive 400-megabyte Excel file. It’s impossible to work with, but DuckDB is perfect for this. In the past, I’ve spent so much time hacking around with stuff like that. I kind of enjoy it in some ways, but now I just describe what I want and get those DuckDB commands out. It’s reinvigorating those little side trails that I did a lot of in the past but now—with a busy job and four kids—I just don’t want to spend time on anymore.
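The sort of one-shot DuckDB restructuring Tim describes might look like this; it assumes the Excel export has been converted to CSV first, and the column names are invented:

```python
import duckdb

# Hypothetical: filter and reorder a large county CSV export, then write a
# compact Parquet file that's easier to work with.
duckdb.sql("""
    COPY (
        SELECT parcel_id, sale_date, sale_price
        FROM read_csv_auto('real_estate_records.csv')
        WHERE sale_price > 0
        ORDER BY sale_date
    ) TO 'sales.parquet' (FORMAT parquet)
""")
```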
Ravi: Exactly. If I’m going to take two hours on a data project, how do I want to spend that time? I used to love being in the low-level code, but one of the things I’ve realized in my career is the higher level of abstraction you can think, the more productive you’ll be. Forcing myself to assume it can take care of the basic code, and then thinking one level of abstraction higher about how all the pieces connect—that’s been the biggest game changer for me.
Code Completion and Reduced Mental Overhead
Tim: I’m actually surprised you’re not using it more at the code completion level. That’s been the biggest change for me. I’ve always relied on some level of code completion in an editor, even just really dumb things like completing variable names. But I find it’s so good at recognizing patterns.
In Python, for example, there’s no great auto-importer. I’ll use some new function from another package and type it into the body of my code, and VS Code’s built-in quick fix for the resulting undefined-name error is really bad at finding the right import. But with Copilot, I jump up to my imports and it almost always knows exactly what import I was trying to add. I don’t have to type out “from pandas import something.”
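Concretely, the pattern looks something like this; pandas stands in for any package, and the helper function is invented:

```python
import pandas as pd  # the line Copilot suggests after seeing the usage below

def summarize(records: list[dict]) -> pd.DataFrame:
    # Hypothetical helper: you write this body first, then jump to the top of
    # the file and accept the suggested import.
    return pd.DataFrame(records).describe()
```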
Ravi: There’s also something almost magical about when it types exactly what you were going to type. I do pull up Cursor every now and then when I need to get lower level, and at that point the code completion is really valuable.
Tim: The other thing is if you’re drawing some kind of structure. I often have something written out in JSON or YAML and I want to write that as Python objects. I’ll paste it into my editor, comment it out, start typing the Python, and it figures out really quickly how to translate that structure. I’ve spent many hours writing regex to do that kind of thing. Now it just does it.
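The workflow Tim describes, sketched with an invented structure: paste the JSON as a comment, start typing, and the completion fills in the rest.

```python
from dataclasses import dataclass

# Pasted structure, commented out:
# {"name": "widget", "price": 9.99, "tags": ["new", "sale"]}

@dataclass
class Product:
    name: str
    price: float
    tags: list[str]

product = Product(name="widget", price=9.99, tags=["new", "sale"])
```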
Ravi: One thing I’ve been really enjoying, and this is partially because of the chat interface, is this: I’ll write a simple function, and in Python the type annotations and docstrings take quite a bit of time to type out. So what I’ve sometimes done is take a skeleton of a function and throw it into Claude, saying, “This is what I’m trying to get out of this. Give me any advice, but also just write all of this boilerplate stuff.”
I think of that as code completion at the function level. I’m asking it to think cohesively about that function: “This is what the function needs to do, this is the context of why I’m doing it.” Something that would have taken me half an hour takes a minute now.
Tim: I do that a lot with docstrings. I like to add them in theory, but I actually hate writing them. You can go to the top of a function and Copilot can usually produce a fairly accurate docstring and even generate examples. I’m not writing big libraries used by lots of people; it’s little internal things, so how much is it worth spending time on? But when it can do it for you, it’s pretty nice.
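A sketch of this function-level completion; the function and its details are invented, but the shape is what Ravi and Tim describe: hand over a bare skeleton and get back types, a docstring, and an example.

```python
from collections.abc import Callable, Iterable
from typing import TypeVar

T = TypeVar("T")

# The skeleton handed to the model might be just:
#     def dedupe_events(events, key):
#         ...
# and the fleshed-out version that comes back:

def dedupe_events(events: Iterable[T], key: Callable[[T], str]) -> list[T]:
    """Remove duplicate events, keeping the first occurrence of each key.

    >>> dedupe_events([{"id": "a"}, {"id": "a"}], key=lambda e: e["id"])
    [{'id': 'a'}]
    """
    seen: set[str] = set()
    result: list[T] = []
    for event in events:
        k = key(event)
        if k not in seen:
            seen.add(k)
            result.append(event)
    return result
```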
Implications for Junior Developers
Tim: You’re in a senior position in your current role but still doing heavy IC work. How are you thinking through these tools in the hands of someone straight out of college? I know that’s been a concern people have had.
Ravi: That’s a great question. This AI stuff, at least at this point where it’s still error-prone and can’t think about large systems cohesively—it’s a huge double-edged sword. Clearly a lot of us are finding huge speed-ups in our programming, but it also makes it really easy to not look at your code at all, to just generate something, run it, and if it works, move on.
This is particularly dangerous for people who don’t have the experience of understanding edge cases where things may fail, or understanding inefficiencies in the code.
I mentioned earlier that I always code review anything it generates. I do a little mini PR. I go through it. Even with Claude, which is one of the more advanced LLMs right now, almost always it’s going to do something where I’m like, “Hey, take a look at that.”
With more junior employees or people straight out of college who don’t have that experience, code review will still be important. As I start to hire people for whom this is a bigger risk, I’m probably going to try to set up something where they make it clear when something was LLM-generated. During code review we can pay more attention to that.
At this moment in time, we cannot 100% trust LLMs, and we need to accept that they can’t completely do our jobs right now.
Tim: At the same time, there’s really no going back. We can’t pretend people aren’t going to use these tools. Being explicit about it is going to be essential—being open about it and trying to give people good guidance.
We’ve had the same concerns over the years about people copying and pasting code snippets from Stack Overflow that they didn’t understand. I’ve done it many times. In some ways, we’re just speeding up the ability to do that, with some possible other risks. Teams and leaders are going to have to be very open about it. That idea of indicating it in code review is good—teaching people to look over their own code, not just take the generated code for granted.
Ravi: That’s a great point—we’ve been doing this forever with Stack Overflow. We copy and paste code in. It’s always been best practice to understand the code, to even cite where you got it from.
Shadow LLM Use
Tim: There’s an interesting issue of companies being hesitant about this. There are legitimate legal concerns and related issues, but at every company now there’s going to be shadow LLM use: people using tools they can’t talk about because they aren’t allowed to use them. That has to be an incentive for companies to figure it out. They can do everything they want to lock things down, but if people can’t do it on their work computers, they’ll pull up their phones and use ChatGPT.
Ravi: Absolutely. It’s so interesting to see how different companies are approaching this. Some are fully embracing it—internal tools that make it easy to use LLMs and ensure the code’s not getting shared back inappropriately. But definitely there are companies just trying to ban it, and I think that’s a huge mistake. You’re going to make your employees less productive, plus they’re just going to go off and do it on the side anyway.
Tim: I know for a fact that’s happening. I talk to a lot of people doing that kind of thing. There’s basically no way to stop it unless you’re putting people in an air-gapped lab. If they have their phones, they have access to these tools. People are using Slack to copy and paste something and then open it up on their phone.
I work at a bank where we have particular data privacy risks, but I’ve been very impressed—they’ve been making an effort. We explored Amazon’s CodeWhisperer offering, did a company-blessed trial, but it ended up not being very good. Now we’re doing a trial with Copilot, and I’ve been able to be on it. I love having access to it at work. I get a lot of speed-up from it, and just the ability to free up brain cycles is really nice.
Imagining going to another company with a total prohibition just feels crazy to me. I assume companies are going to start seeing pushback in hiring; people are going to ask, “Can we use these tools?”
Ravi: As you said, there’s no going back. Increasingly it’s just going to become a normal part of the dev experience. It would be the equivalent of saying, “Hey, you can’t use an IDE to program.”
Where LLMs Fall Short
Tim: Have you come across patterns or particular problems where the tools have fallen short? Things you know you aren’t going to be able to get answers for?
Ravi: All the time. I don’t think the current best-in-class LLMs are capable of reasoning at a system level very well. As the codebase I share gets larger and more complex, and I try to fill the context with it, the model starts losing track of the bigger picture. If you think about what an LLM is doing, it’s not really designed to think at that system level. It’s still good at thinking in terms of paragraphs rather than books, so to speak.
I think this is changing quickly. I’m really curious about how chain of thought changes this. I’m sure there’s a near-term future where LLMs can actually run code and see what happens. But one big limitation right now is that system level. I’d like to think in terms of abstractions and go up levels. I’m really excited for the day where I can think in terms of classes working together instead of going into the class and working on that.
Tim: We probably should have opened this by explaining what models we’re on now for people listening in the future.
Ravi: Yeah, Claude 3.5, GPT-4, and o1.
Tim: The thing that’s been interesting to me recently (I don’t get to use Cursor in a work context, only in personal projects) is Cursor’s Composer feature, where you give it free rein to understand the whole repository and make changes across multiple files.
I have personal website projects I’d like to do, but I don’t have front-end knowledge and very limited JavaScript knowledge. I’ve been trying to hack around with what people are doing—bootstrapping an entire project in Cursor and having it generate and update stuff. Every time I’ve done it, I’ve hit a wall where it starts cycling through wrong solutions to something.
Ravi: Yeah.
Tim: I haven’t spent much time on it, literally hours. But I’ve gotten way further than I ever would in my free time trying to do it by hand. I spent many Christmas breaks over the years trying to teach myself enough JavaScript to get something running. Now I’ve gotten to where I actually have a functional React app, for example. But then interacting with data storage layers like Supabase and SQLite Cloud, these pseudo-serverless free-tier cloud storage things, is where it ends up falling on its face.
The point is that it’s trying to look at the bigger picture in a way that Copilot has been able to do for a bit, but Cursor is trying to take that to another level. Even if it isn’t perfect at writing new code, just the ability to open a repository and have an LLM understand it and explain it to you—how many of us have spent so many hours coming into a company, trying to understand the codebase? I think it’s going to be really valuable to just open it up and ask your robot, “What is this codebase? What parts are important? How do I even run this?”
Ravi: I love it for that. One thing I’ll often do in a Claude session is load it up with multiple parts of my code and give a brief description: “I’m building a recommender system, here’s some context about it.” It can often understand the individual pieces and how they roughly go together. I can ask it to give me code to run something or explain how a functionality works.
But there is a major danger zone in current LLMs, and that’s the context limit, where what I call the LLM “going off the rails” kicks in. At some point its error rate goes up and its apparent intelligence goes down. It’s important to understand that limitation: LLMs struggle as the context grows, and if you load too much code, the model starts to flounder.
Also, as you talk to it more and iterate more, it starts struggling. I ran an experiment a couple of days ago where I had a function I was trying to speed up. I could already see three places I could optimize it—almost like an interview question. I asked, “Tell me some ways I can speed it up.” It started rewriting it. I could run the code and see the time—maybe it starts at 200 milliseconds, gets to 180, 150. It’s actually doing something.
Then I keep telling it, “Can you make it faster?” And it’s always like, “Of course!” But then you see the time actually start going up—250, 300 milliseconds. There’s a certain overconfidence in these LLMs. It’s like, “Yeah, I can do whatever you ask me.”
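A minimal harness for keeping the model honest, along the lines of Ravi’s experiment; slow_fn and candidate_fn are placeholders for the original function and the LLM’s rewrite:

```python
import timeit

def time_fn(fn, data, repeats=5):
    """Best-of-N wall-clock time for fn(data), in milliseconds."""
    return min(timeit.repeat(lambda: fn(data), number=1, repeat=repeats)) * 1000

# Hypothetical usage: check each "faster" rewrite against the baseline.
# print(f"original: {time_fn(slow_fn, data):.0f} ms")
# print(f"rewrite:  {time_fn(candidate_fn, data):.0f} ms")
```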
Tim: That’s an area where you can’t check out; your brain has to keep working. It’s going to be interesting, and we’re at such an early stage.
Ravi: It’ll be interesting to listen to this in a few years and just be like, “Wow, I can’t believe how basic it was.”
Concerns About Skill Atrophy
Tim: Do you have any concerns about losing particular skills? You’re thinking about generating code at the function level—are you worried it’s making you more rusty? Or are you comfortable with the knowledge and experience you have being able to dive in when you need to?
Ravi: That’s a good question. I feel like I should be worried about it. But honestly, because I don’t think these models are going to get any worse, there are certain things I’m slowly deciding I no longer need to know. I now trust it to abstract some things away from me.
A concrete example is data visualization. When I moved from R to Python, ggplot was torn away from me. I loved ggplot—it was a grammar of graphics. Once you know the language, everything is intuitive. I never found that with Python. With Python, data visualization was always “look at the docs, read the docs, understand it” with any of the popular libraries.
Now I find LLMs are perfectly capable of taking a description of what I want, something I would have mapped into ggplot language, and building the code. I’m okay with being rusty at translating a visualization into code. Increasingly, I’m going to trust these small three- or four-line pieces of code. I can go back, read them, and understand what I was doing, but I’m okay not knowing that syntax. I’m okay just looking at the visualization and saying, “Cool, it did what I wanted.”
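For example, a request like “scatter of price against square footage, colored by year built” comes back as code along these lines; the data and column names here are invented:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for a real dataset.
df = pd.DataFrame({
    "sqft": [850, 1200, 1600, 2100],
    "price": [150_000, 220_000, 305_000, 410_000],
    "year_built": [1962, 1985, 1999, 2012],
})

fig, ax = plt.subplots()
points = ax.scatter(df["sqft"], df["price"], c=df["year_built"], cmap="viridis")
fig.colorbar(points, ax=ax, label="Year built")
ax.set_xlabel("Square footage")
ax.set_ylabel("Sale price ($)")
plt.show()
```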
Tim: The flip side is that you also have the ability to ask the LLM to go back and explain what you did.
Ravi: That’s a great point.
Tim: I’ve done that with SQL queries. It doesn’t always get it perfectly, but I deal with a lot of SQL in my current role, and it can be hard to understand sometimes. Just pasting something into the assistant and saying “can you add comments to this and help me explain it,” without changing anything, is really powerful.
Again, it’s really hard to know how this is going to impact us in the long run. We have the great benefit of having learned the foundations. To younger listeners: I’d encourage you to build your foundations and not punt on really learning things. At the same time, these tools are huge assets, so don’t ignore them either.
Ravi: I completely agree. I don’t want to sound like an old dude lecturing young people, but there’s a real danger that you get mid-level in your career and it turns out you don’t really know what you’re doing. I think that wasn’t really possible when we were starting off. That’s becoming increasingly possible. You can ship error-prone code. You can not actually be able to explain your code. There are so many danger points.
The Foundation of Math and Learning
Tim: My background was in math. There’s been a lot of discussion recently about LLMs for math, and Terry Tao, one of the greatest living mathematicians, has actually been digging into LLMs and is very optimistic about their benefits for math.
Having studied math, I see this in many ways as the extension of your seventh-grade teacher telling you not to use the calculator for everything. I spent hours toiling away in the library struggling with proofs, whereas now an LLM could probably solve almost every proof set before an undergrad. Learning how to reason through those problems was what made me successful in math, and I see it as a huge benefit to my whole career, just in the way I learned how to solve problems.
I don’t really know what the solution is, but I’m glad I didn’t have what could be a crutch. At the same time, if I were a professional mathematician today, I would absolutely be looking at LLMs—if nothing else, to help with my LaTeX typesetting. But even problem solving. There’s no reason you wouldn’t want to use that to advance problem solving. It’s an interesting balance we’re going to have to figure out.
Have you looked at any of the AI code review tools yet?
Ravi: I probably should. What’s out there?
Tim: I haven’t used them either. I’ve only heard people talk about them. There are integrations with GitHub that go in and add comments. I don’t know how you tune those, but I think it’ll be interesting in the future.
One thing I’m really excited about—hopefully going to have another episode soon with a mutual friend of ours who’s running a small company and has been able to do way more because of LLM-assisted development. I think he’s only himself and one other engineer.
I’m really interested in how this is going to enable entrepreneurs and small teams to build things. Code review is really valuable, but as an individual developer you don’t always have someone to review your code, and it’s easy to miss things. If an LLM is good enough to provide useful feedback, that could fill the gap.
Ravi: One of the nice things about code review as a use case: with code generation, wrong code that mishandles edge cases can be really dangerous, but with code review, false positives are not a big deal. If it says “this function doesn’t look right” and the function is actually fine, maybe you wasted 30 seconds. And I can imagine it being much more thorough than the typical PR review I’m used to, which is “looks good to me” or “hey, why’d you do it this way?”
Tim: And the possibility of being more thorough. People used to put in things like McCabe complexity checks—very formulaic. The ability for something to think with more context and say “maybe this is too complex and we could split it up” would be really cool. The next step beyond that is maybe automated suggested changes—open a PR into your PR with recommended changes.
Ravi: Just click through the changes. It’s interesting to see how we’re slowly iterating towards more of our code being written and reasoned about by AI. We’re clearly not there, but I think we’ll keep seeing iteratively the LLMs being able to take over more of it. I’m very curious if there’s an endpoint, or maybe 10 years from now they’re just doing our jobs for us. But we’re just in the early stages where it’s a huge boost to our productivity, and we’re still ultimately in charge of our code.
Looking Forward
Tim: I was going to conclude by asking if you had any forecasts. A 10-year window seems like an insane amount of time, but do you have any two-year or five-year predictions?
Ravi: I think there’s probably a ton of low-hanging fruit specifically in our space. One thing I mentioned earlier is LLMs actually running code. I’m sure that’s being done. But I think as we treat LLMs more as an agent in a codebase, even the models that exist now can probably do more.
Clearly the models are getting better. We’re throwing a ton of money into making these models reason better. It seems like the more computation we throw at it, the better it gets. I’m sure we’ll hit limits with the current technology—I’m deeply skeptical that we’re going to hit general intelligence with what we have now. But for the last two years, we’ve seen consistent, huge jumps in performance. I believe very much in momentum. I think we’re on the path towards LLMs owning more of a codebase. We’re not there yet, but I think we’re moving in that direction.
Tim: It’ll be fun to see. It’s very interesting, and very hard to make predictions. There are a lot of crazy predictions out there.
Ravi: Again, I don’t want to come back and listen to this and be like, “Wow, I was so wrong.”
Tim: Could we have even imagined this? You’ve been doing machine learning a few years longer than me, but in 2010, could we have imagined this? It seems wild. A “language model” back then was a Markov chain model. They were so dumb; we were thinking about language in terms of bags of words. Even 10 years ago, I was working on really advanced Bayesian models for language, and they had such a simplistic understanding of it.
Just the ability to generate text that even looks sensible, much less actually is sensible, is wild. It’s going to be really cool to see what that turns into.
Ravi: Think about just five years ago: GPT-2 was released almost exactly five years ago, and everybody thought it was kind of a toy or a joke. That was very much the same technology. It’s wild to see what five years of just pushing on the same concepts has gotten us, let alone how it looked 10 years ago, when the whole idea seemed like a joke.
Tim: Very cool. All right, well if you don’t have anything else to add, we’ll wrap up there. This has been a great discussion. I appreciate your perspective.
Ravi: Likewise. Thanks for having me.
