However, when I tried cc, it left me very disappointed. For context, I’m working on a relatively greenfield Rust project and gave it tasks that I would consider appropriate for a junior-level colleague, such as:
- change the return type of a trait and all its impls
- refactor duplicate code into a helper function
- replace some of our code with an external crate
It didn’t get any of them correct and took a very long time. Am I using the tool wrong?
How are you using cc or other agentic tools?
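For readers who want a concrete picture of the first two tasks in the list above, here is a minimal sketch. Everything in it (the `Parser` trait, `ParseError`, `Decimal`, `Hex`, and the `parse_radix` helper) is hypothetical and not taken from the poster's project; it only illustrates changing a trait method's return type from `Option` to `Result` across all impls and pulling duplicated logic into one helper.

```rust
// Hypothetical error type introduced for this illustration only.
#[derive(Debug, PartialEq)]
struct ParseError(String);

// Before the change, the method might have returned Option<i64>.
// After the change, every impl has to return Result<i64, ParseError>.
trait Parser {
    fn parse(&self, input: &str) -> Result<i64, ParseError>;
}

struct Decimal;
struct Hex;

// Helper extracted from what would otherwise be duplicated in both impls.
fn parse_radix(digits: &str, radix: u32) -> Result<i64, ParseError> {
    i64::from_str_radix(digits, radix)
        .map_err(|e| ParseError(format!("{digits:?}: {e}")))
}

impl Parser for Decimal {
    fn parse(&self, input: &str) -> Result<i64, ParseError> {
        parse_radix(input.trim(), 10)
    }
}

impl Parser for Hex {
    fn parse(&self, input: &str) -> Result<i64, ParseError> {
        let trimmed = input.trim();
        // Accept an optional "0x" prefix before the hex digits.
        let digits = trimmed.strip_prefix("0x").unwrap_or(trimmed);
        parse_radix(digits, 16)
    }
}

fn main() {
    println!("{:?}", Decimal.parse(" 42 "));
    println!("{:?}", Hex.parse("0xff"));
    println!("{:?}", Decimal.parse("abc"));
}
```

The point of the sketch is that the compiler flags every impl that still has the old signature, which is why this kind of task looks like a fair test for a tool that can run the build.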
I've been having fun with Claude Code and VSCode's agent. Any reasonably experienced engineer should be able to use it for a subset of languages without too many issues, but they definitely need to hydrate the context (e.g. using CLAUDE.md) and have a sensible set of system prompts set up. Good, well-written user prompts that are broken down into steps are non-negotiable.
I think you are using the wrong language, to be honest. LLMs are best at languages like Python, JavaScript and Go: relatively simple structures and huge amounts of reference code. Rust is a less common language that is much harder to write.
Did you give Claude Code tests and the ability to compile in a loop? In Go at least, it's pretty good at debugging and fixing issues when allowed to loop.
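To make "compile in a loop" concrete, here is a rough sketch of the control flow, not how Claude Code actually works internally. The `fix_loop` function and both closures are hypothetical: in a real setup, `check` would run something like `cargo check` or `cargo test` and return the diagnostics, and `fix` would hand those diagnostics back to the agent. Here both are simulated so the example runs on its own.

```rust
use std::cell::Cell;

/// Runs `check` up to `max_attempts` times. After each failure except the
/// last, passes the diagnostics to `fix`. Returns the attempt number that
/// passed, or the last diagnostics if no attempt passed.
fn fix_loop<C, F>(max_attempts: u32, mut check: C, mut fix: F) -> Result<u32, String>
where
    C: FnMut() -> Result<(), String>,
    F: FnMut(&str),
{
    let mut last_error = String::from("no attempts were made");
    for attempt in 1..=max_attempts {
        match check() {
            Ok(()) => return Ok(attempt),
            Err(diagnostics) => {
                // Only ask for another fix if there is an attempt left to verify it.
                if attempt < max_attempts {
                    fix(&diagnostics);
                }
                last_error = diagnostics;
            }
        }
    }
    Err(last_error)
}

fn main() {
    // Simulated project state: two errors, and each fix removes one.
    let remaining = Cell::new(2u32);
    let result = fix_loop(
        5,
        || match remaining.get() {
            0 => Ok(()),
            n => Err(format!("{n} errors remaining")),
        },
        |_diagnostics| remaining.set(remaining.get() - 1),
    );
    println!("{result:?}");
}
```

Bounding the number of attempts matters: without a cap, a tool that keeps applying the wrong fix will loop indefinitely.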
Current LLMs still get stuck, at least a reasonable percentage of the time, on race conditions and bugs that aren't obvious from static analysis. If you can explain the exact source of a bug to an LLM, it can get it, but if there's a seemingly obvious solution that isn't the correct one, it will try to fix things the wrong way.
It's best to use AI in areas where a lack of specificity or precision isn't a major hindrance, and where every abstraction is a closed loop that won't hurt you in the future because you don't know how it works.
What helped me was shifting how I use it. I don’t treat it like a junior dev anymore, I treat it more like a second brain. For example:
I use Claude Code to explore options before I commit to a design. I’ll ask it “what are 3 ways to abstract this logic?” and sometimes that alone gives me a better direction.
It’s pretty good at turning rough notes or comments into starter code or test cases. That saves time on boilerplate.
If I feed it a clean, self-contained chunk of code and ask for a targeted change (e.g., “convert this to async”), it often nails it. But yeah, across a codebase, not so much.
I've had less luck generating new features. It's great for prototyping UI, but I routinely end up writing it myself.
It's also quick to forget how I like to do things or what libraries and packages it should use. So I either have to keep reminding it or fix up the work myself. While I'm unsure whether it still ends up being quicker, that's really immaterial for me because it absolutely kills the enjoyment of the work.
I now have 5-10 small services running. Whatever "thing" I think I need, I create it and self-host it.
It's such a revolution.
Have been using it to build a DSL in JS. Greenfield. I’ve followed the commonly touted “plan, act, evaluate” approach; I’ve got it to generate a clear project vision, scope, and feature checklist. Then told it to refer to that for context. I’ve been descriptive and explicit in my prompting, way more so than previously.
It has gotten the broad strokes right. I’ve got an exceptionally barebones DSL, made up of 5 entities, working…just.
It has now started to spin its wheels on small issues and can’t fix them without breaking something else. The codebase isn’t even big (~8 main functions across a few files). Troubleshooting the code is difficult because it’s convoluted and I lack the same intuition for it I would have had I written it myself. I’ve decided to rewrite everything with less control ceded to the LLM.
When it works, it feels great. When it doesn’t, which is often, the spell is broken and I feel I’ve wasted a bunch of time and have not much to show for it.
I think we have to build up enough code for it to start appearing like brownfield, before Claude knows how to engineer correctly. Which kind of makes sense if we view Claude Code as a junior engineer with infinite stamina.
I also like to spin up Claude Code and Gemini in parallel to see what each one comes up with. Gemini will often take the simpler approach, but it's often not fully featured, so I usually end up taking the two solutions and refining them in Cursor to come up with the final one.
It’s useful as a built-in quick docs / search that can spit out small code fragments.
Every time I gave it more space, the results were disappointing.
What language are you using?
> I’m working on a relatively greenfield rust project
I haven't had good luck using LLMs with Rust, but it may just be me.