Coincidentally, I have recently been implementing an ad pacing system based on a PID controller, with the help of Anthropic Opus and Sonnet.
Opus recommended that I use a PID controller; I have no prior experience with them. I wrote a spec based on those recommendations, and asked Claude Code to verify and modify the spec, then create the implementation along with a substantial number of unit and integration tests.
I was initially impressed.
Then I iterated on the implementation, deploying it to production and later giving Claude Code access to a JSON log of production measurements taken while showing some test ads, along with some guidance on the issues I was seeing.
The basic PID controller implementation was fine, but there were several problems with the solution:
- The PID controller state was not persisted. Because the controller was adjusted using a management command, the adjustments were never actually applied
- The implementation assumed that data was collected for each impression, whereas it was actually collected using counters
- It calculated the rate of impressions partly using hard-coded values, instead of using a provided function that calculated the rate from timestamps
- There was a single PID controller for each ad, instead of ad+slot combination, and this was causing the values to fluctuate
- The code was mixing the setpoint/measured value (viewing rate) and output value (weight), meaning it did not really "understand" what the PID controller was used for
- One requirement was to show a default ad to take extra capacity, but it was never able to calculate the required capacity properly, causing the default ad to take too much of the capacity.
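To make a few of these points concrete, here is a minimal sketch of what I mean, not the actual production code. It assumes cumulative impression counters sampled with timestamps, one controller per (ad, slot) pair, and state that is serialized between runs. All names (`impression_rate`, `PacingPID`, `save_states`, `load_states`) and the JSON storage format are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


def impression_rate(prev_count, prev_ts, count, ts):
    """Impressions per second from two cumulative counter samples."""
    elapsed = ts - prev_ts
    if elapsed <= 0:
        return 0.0
    return (count - prev_count) / elapsed


@dataclass
class PIDState:
    integral: float = 0.0
    prev_error: float = 0.0


class PacingPID:
    """Setpoint and measurement are rates; the output is a weight correction."""

    def __init__(self, kp, ki, kd, state=None):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.state = state if state is not None else PIDState()

    def update(self, target_rate, measured_rate, dt):
        # The error is always rate minus rate, never mixed with the weight.
        error = target_rate - measured_rate
        self.state.integral += error * dt
        derivative = (error - self.state.prev_error) / dt if dt > 0 else 0.0
        self.state.prev_error = error
        return (self.kp * error
                + self.ki * self.state.integral
                + self.kd * derivative)


def save_states(controllers):
    """Serialize the state of every (ad_id, slot_id) controller to JSON."""
    # Assumption: ad ids contain no ":" so the key can be split again.
    return json.dumps({f"{ad}:{slot}": asdict(c.state)
                       for (ad, slot), c in controllers.items()})


def load_states(blob, kp, ki, kd):
    """Rebuild controllers so a separate process continues from saved state."""
    controllers = {}
    for key, raw in json.loads(blob).items():
        ad_id, slot_id = key.split(":", 1)
        controllers[(ad_id, slot_id)] = PacingPID(kp, ki, kd, PIDState(**raw))
    return controllers
```

The point of the round trip through `save_states` and `load_states` is that a management command running in its own process picks up the integral and previous error instead of starting from zero every time.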
None of these were identified by the tests, nor by Claude Code when it was told to inspect the implementation and the tests to find out why they had not caught the production issues. It never proposed using different default PID controller parameters.
All fixes Claude Code proposed on the production issues were outside the PID controller, mostly by limiting output values, normalizing values, smoothing them, recognizing "runaway ads" etc.
These solved each production issue with the test ads, but did not really address the underlying problems.
There is a lot of literature on tuning PID controllers, and there are also autotuning algorithms with their own limitations. But tuning still seems to be more of an art form than an exact science.
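As one example from that literature, the classic Ziegler-Nichols closed-loop rule derives starting gains from the ultimate gain Ku (the proportional gain at which the loop oscillates steadily) and the oscillation period Tu. This is only an illustration of a textbook rule, not something that was used in the system described here, and the function name is hypothetical.

```python
def ziegler_nichols_pid(ku, tu):
    """Classic Ziegler-Nichols PID rule: Kp = 0.6*Ku, Ti = Tu/2, Td = Tu/8.

    Returns (kp, ki, kd) for the parallel form kp*e + ki*integral(e) + kd*de/dt.
    """
    kp = 0.6 * ku
    ti = tu / 2.0  # integral time
    td = tu / 8.0  # derivative time
    return kp, kp / ti, kp * td
```

Gains from a rule like this are usually treated as a starting point to be adjusted by hand, which is part of why tuning feels like an art.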
I don't know what I was expecting from this experiment, or how much could have been improved by better prompting. But to me this is indicative of the limitations of the "intelligence" of Claude Code. It does not appear to really "understand" the implementation.
Solving each issue above required some kind of innovative step. This is typical for me when exploring something I am not too familiar with.
Great story. I've had similar experiences. It's a dog walking on its hind legs. We're not impressed at how well it's walking, but that it's doing it at all.
I learned a lot about ad pacing though.