Google Bard explains ARM assembly (badly)
While investigating whether Google Bard might (falsely) accuse me of snitching on the mob like Bing did, Bard suggested “explain code” was something it could do. As someone who has spent a lot of time staring at disassembly wishing it came with an explanation, I thought this sounded great[1] and asked it to explain a bit of hand written assembly from the CHDK project.
I picked this code because it’s fairly straightforward and short enough to be manageable, but not totally predictable boilerplate. If I encountered it in reverse engineering, a couple things might puzzle me for a few minutes, but it’s easily tractable.
If line-by-line explanation of why a chatbot’s line-by-line explanation of assembly code is bad ain’t your thing, here’s the tl;dr. The main discussion assumes familiarity with assembly concepts, though I suspect much of the wrongness will shine through regardless.
Notation
Bard’s “explanatory” text, including headings and code blocks, is in block quotes, like this:
> Sure. Here is a step-by-step explanation of the assembly language code you provided:
My comments follow Bard’s, in regular text like this, with bold for emphasis and CODE for mnemonics, register names etc., and footnotes for asides or additional detail[2].
Spoilers
Before getting into Bard’s “explanation”, here’s a quick explanation of what the code actually does.
The assembly file used implements a function with the C signature
unsigned call_func_ptr(void *func, const unsigned *args, unsigned n_args)
It calls the function at the address func with the n_args arguments contained in the array *args.[3] In pseudo-code, something like:
call_func_ptr(func, args, n_args):
    return func(args[0], args[1], ... args[n_args-1])
with arbitrary args and n_args provided at run time. It’s in assembly because this can’t be expressed straightforwardly in standard C.[4]
The assembly implementing this amounts to a bit over 20 instructions.
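For anyone who would rather poke at something runnable, here is a minimal Python sketch of the same behaviour. It is only a model of what the function does, not how the CHDK assembly does it: Python's argument unpacking stands in for all the register and stack juggling, and add3 is a made-up target function used purely for illustration.

```python
def call_func_ptr(func, args, n_args):
    """Call func with the first n_args values of args as positional arguments."""
    return func(*args[:n_args])


def add3(a, b, c):
    # Hypothetical target function, used only for illustration.
    return a + b + c


# Only the first three values are passed; the trailing 99 is ignored.
result = call_func_ptr(add3, [1, 2, 3, 99], 3)
```

The point of the assembly version is that C has no equivalent of the `*args[:n_args]` unpacking, so the work has to be done by hand.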
ARM odds and ends
To make the implementation of the above function easier to follow, here’s a quick overview of the typical ARM calling convention used on this platform:
- The first four arguments are passed in the registers R0 through R3
- Additional arguments are pushed on the stack such that SP points to the 5th argument on entry to the called function
- The stack is full (SP points to a value, not the next free slot), descending (push decrements SP by 4)
- Return value is in R0
- For our purposes, everything is 32 bit words. Smaller arguments are implicitly promoted, larger are not considered
- Functions are free to modify R0 through R3 and R12, others must be preserved
- Return address is in LR
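The argument-passing part of the convention above can be sketched as a tiny Python model. This is an illustration under the simplifications already stated (everything is a 32 bit word); the function name place_arguments and the dictionary representation of registers are my own inventions, not anything from the ARM documentation or CHDK.

```python
REG_ARGS = 4  # arguments passed in R0 through R3


def place_arguments(args):
    """Return (registers, stack_words) for a call under the convention above.

    stack_words[0] is the word SP points to on entry to the called
    function, i.e. the 5th argument; later arguments sit at higher addresses.
    """
    regs = {f"R{i}": value for i, value in enumerate(args[:REG_ARGS])}
    stack_words = list(args[REG_ARGS:])
    return regs, stack_words


regs, stack_words = place_arguments([10, 20, 30, 40, 50, 60])
```

With six arguments, the first four land in registers and the remaining two go on the stack, which is exactly the split call_func_ptr has to reproduce at run time.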
The test
Bard’s complete “explanation” with the original prompts and without my analysis interspersed is here. As noted about 3/4 down that page, the initial Bard response appeared to cut off abruptly and I used a second prompt to get more output.
A copy of the original code used is here. This includes the comments I stripped out before pasting to the Bard prompt.
OK Bard, roll for wisdom check
Without further ado, here’s Bard’s “explanation”, and my explanation of why Bard’s explanation is crap.
> Text section:
>
>     .text
>
> This line tells the assembler that the following code is part of the text section, which contains executable instructions.
>
> Global symbol:
>
>     .global call_func_ptr
>
> This line tells the assembler to make the symbol call_func_ptr globally visible. This means that other code modules can call the function at this address.
>
> Function definition:
>
>     call_func_ptr:
>
> This line defines the start of the function call_func_ptr.
>
> Stack saving:
>
>     PUSH {R4-R5,LR}
>
> This instruction pushes the registers R4 through R5 and LR onto the stack. This is done to preserve the values of these registers while the function is running.
All of the above is broadly correct, and mostly trivial.
> Register setup:
>
>     MOV R12, R0
>     MOV R4, SP
>
> These instructions move the values of the R0 and SP registers to the R12 and R4 registers, respectively.
Correct, though not terribly informative.
A more useful explanation would note R0 is the first argument to call_func_ptr, and that R12 is not preserved across calls, so this MOV isn’t simply saving R0. In other words, paying attention to where R12 crops up later will tell you something about the function and its arguments. OTOH, R4 is preserved, so SP is likely just being saved to make it easy to clean up later stack manipulation.
> Stack setup:
>
>     ADD R5, R1, R2,LSL #2
>     RSBS R0, R2, #4
>     BLO stack_setup
>     ADD PC, PC, R0,LSL #2
>     NOP
Here Bard goes off the rails. For starters, the section heading is misleading: The code conditionally branches to a label named stack_setup but otherwise has nothing to do with the stack.
> These instructions perform the following steps:
Bard commented on this entire block with a bulleted list. For easier reference, I’ve reproduced the individual instruction under discussion above each bullet point.
ADD R5, R1, R2,LSL #2
> 1. Add the values of the R1 and R2 registers, shifted left by 2 bits, to the R5 register. This calculates the address of the function’s stack frame.
The first sentence is only slightly wrong: The result is stored in R5, not added to R5, which is an important distinction because R5 is not set in the preceding code of this function and would contain unknown random junk. Read literally, the wording also suggests that both R1 and R2 are shifted, when in fact only R2 is.
It’s also uninformative: A human analyzing this function would note that R1 and R2 haven’t been modified in the code so far, so they are the second and third arguments to call_func_ptr. In other words, R5 = args + 4*n_args, i.e. the end of the args array, with the shift to multiply by 4 because the elements are 32 bit words.[5]
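As a worked example of that address arithmetic, here is a small Python model of the ADD. The address 0x1000 used below is an arbitrary made-up value for illustration, and the masking to 32 bits is just there to mimic register width.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def end_of_args(args_addr, n_args):
    """Model ADD R5, R1, R2, LSL #2: only the last operand (R2) is shifted."""
    return (args_addr + (n_args << 2)) & MASK32


# With args at the hypothetical address 0x1000 and three arguments,
# R5 ends up one word past the last element of the array.
r5 = end_of_args(0x1000, 3)
```

Note that the result points just past the last element, which matters later: the code that reads from R5 always decrements it before loading.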
As for the second sentence, it’s just wrong. As noted above, R1 and R2 are both function arguments and nothing in the code connects them to the stack, so the claim this “calculates the address of the function’s stack frame” is pure fiction.
Throughout its “explanation”, Bard refers to “the function”, which is ambiguous in that it could be read to refer to the function call_func_ptr being explained, or the func argument to it. References to “stack frame” are similarly ambiguous: I wouldn’t call args a stack frame, but to the extent one can credit Bard’s “explanation” with referring to anything that exists in the code at all, it sometimes seems to.
RSBS R0, R2, #4
> 2. Subtract 4 from the R2 register. This is because the function’s stack frame will need to store 4 return addresses.
The first part is wrong: RSB is a reverse subtract, so this is R0 = 4 - n_args. The S indicates flags should be set, so that subsequent instructions can be conditional on things like whether the result was zero, negative, etc.
The rest is just gobbledygook: There is nothing involving a return address, never mind four of them(?!), and nothing in the code to suggest there might be. The entire “because” sentence is pure invention. The “explanation” isn’t even self-consistent: Given stack elements and addresses are 32 bit, an offset of 4 would only accommodate one.
BLO stack_setup
> 3. Branch to the stack_setup label if the R2 register is less than zero. This means that the stack frame is too large and the function needs to return an error.
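The first sentence is wrong: The branch doesn’t look at R2 at all. It tests the flags set by the RSBS, and is taken if 4 - n_args (the value placed in R0) came out negative, i.e. there were more than four arguments. Strictly, BLO is the unsigned “lower” condition, so “negative” here means the subtraction borrowed.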
The rest is nonsense: it has nothing to do with any stack frame being too large, and there is no error involved. It’s worth noting that, again, nothing in the source would suggest Bard’s “explanation” here to a human reader, it’s just completely made up.
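To make the RSBS and BLO pair concrete, here is a rough Python model of the flag logic. It only tracks the carry flag, which is the one BLO tests, and the function names are mine. On ARM, a subtraction sets carry when it does not borrow, and BLO (also spelled BCC) branches when carry is clear.

```python
MASK32 = 0xFFFFFFFF


def rsbs(rn, operand2):
    """Model RSBS Rd, Rn, #operand2: Rd = operand2 - Rn, updating flags.

    Returns (result, carry). Carry is set when the subtraction does NOT
    borrow, i.e. when operand2 >= Rn treating both as unsigned.
    """
    result = (operand2 - rn) & MASK32
    carry = operand2 >= rn
    return result, carry


def blo_taken(carry):
    """BLO (unsigned lower) branches when the carry flag is clear."""
    return not carry


def goes_to_stack_setup(n_args):
    """True if RSBS R0, R2, #4 followed by BLO would branch to stack_setup."""
    _, carry = rsbs(n_args, 4)
    return blo_taken(carry)
```

Running goes_to_stack_setup over 0 through 6 arguments shows the branch is taken only for 5 and 6, which is the "more than four arguments" case and nothing to do with errors.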
ADD PC, PC, R0,LSL #2
> 4. Add the value of the R0 register, shifted left by 2 bits, to the program counter (PC). This effectively branches to the return address at the top of the stack frame.
The first part is correct, though given that adding to PC is obviously calculating an address, “multiplied by 4” would be more informative than “shifted left by 2”.
The second part is just nonsense. R0 is the 4 - n_args value calculated above, and execution reaches this point only if that value is >= 0. So the result will be an offset of between 0 and 4 instructions within the current function, certainly not the return address or anything on the stack.
Once again, there is nothing in the code which supports Bard’s “explanation”, mainly because the “explanation” is incoherent. The instruction jumps to the current execution address PC plus some offset based on the value in R0. The only way this could be the “return address at the top of the stack frame” is if you contrived to set the value of R0 such that the math adds up to whatever that value is, and nothing in the code remotely suggests that.
NOP
> 5. Insert a NOP instruction. This is necessary because the branch instruction in the previous step takes up two bytes, but the PC must be aligned on a four-byte boundary.
This is completely wrong. The code is classic ARM (as opposed to Thumb, Thumb2 or AArch64)[6], so all instructions are exactly four bytes, aligned by four. The NOP is present because of ARM’s somewhat baroque rules for arithmetic operations involving PC, which in this case ends up being the address of the instruction + 8.[7]
But all of that is just a warmup, because Bard’s supposedly line-by-line “explanation” simply ignored a section of code following the NOP, reproduced (with original comments) below, which provides the actual reason for the ADD:
LDR R3, [R5,#-4]! // fourth arg
LDR R2, [R5,#-4]! // third arg
LDR R1, [R5,#-4]! // second arg
LDR R0, [R5,#-4]! // first arg
B do_call // regs done
stack_setup:
SUB R2, R2, #4 // number of stack args
SUB R2, R5, R2,LSL #2 // last = src - (n_stack_args)*4
As the comments explain, the jump calculated by ADD PC,... above is used to set only required registers[8], in the case that there are four or fewer arguments. Any reasonably knowledgeable human analyzing this code would group the LDR Rn lines with the ADD PC,... code, since together they form a single logical construct similar to a C switch.
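The computed jump is easier to see with the addresses worked out. The following Python sketch models it under the classic ARM rule that reading PC yields the address of the current instruction plus 8. The address 0x100 for the ADD is an arbitrary assumption, and the instruction strings are just labels for illustration, not something being assembled.

```python
INSTRUCTION_SIZE = 4
PC_READ_OFFSET = 8  # in ARM state, reading PC gives the current instruction address + 8

# Instructions that follow the ADD PC, PC, R0,LSL #2 in memory order.
AFTER_ADD = [
    "NOP",
    "LDR R3, [R5,#-4]!",
    "LDR R2, [R5,#-4]!",
    "LDR R1, [R5,#-4]!",
    "LDR R0, [R5,#-4]!",
    "B do_call",
]


def executed_after_jump(n_args, add_addr=0x100):
    """List the instructions that run after the computed jump, for n_args <= 4."""
    if not 0 <= n_args <= 4:
        raise ValueError("this path is only reached with four or fewer arguments")
    r0 = 4 - n_args                                   # from RSBS R0, R2, #4
    target = add_addr + PC_READ_OFFSET + (r0 << 2)    # ADD PC, PC, R0,LSL #2
    first_after = add_addr + INSTRUCTION_SIZE
    index = (target - first_after) // INSTRUCTION_SIZE
    return AFTER_ADD[index:]
```

With four arguments the jump lands on the LDR R3 and falls through all four loads; with none it lands directly on B do_call. In no case does execution land on the NOP, which is only there to absorb the + 8.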
We also find the stack_setup label which Bard spuriously ascribed to error handling earlier, but actually handles the case where there are more than four arguments, requiring some to go on the stack. This would more logically be grouped with the immediately following code, since the branch to do_call above is unconditional, and execution can only arrive at stack_setup from the earlier BLO.
Back to Bard’s “explanation”:
> Stack setup loop:
>
>     LDR R0, [R5,#-4]!
>     STR R0, [SP,#-4]!
>     CMP R5, R2
>     BNE stack_setup_loop
>
> This loop copies the caller’s stack frame to the function’s stack frame. The LDR instruction loads a value from the stack frame at the address R5 and decrements R5. The STR instruction stores the value from R0 to the stack frame at the address SP and decrements SP. The CMP instruction compares the values of R5 and R2. If R5 is greater than R2, the loop branches back to the beginning.
The first bit is wrong: it’s copying from the address contained in R5, which is the end of the args array, not “the caller’s stack frame”. The rest is generally correct, aside from the dubious “stack frame” references. The end condition involving R2 is not fully explained either, because the value of R2 was calculated in the stack_setup code Bard ignored, but is in the execution path which leads to this loop.
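Here is a rough Python model of the stack_setup code plus the loop, to show what actually gets copied and where it stops. List indices stand in for word addresses, and the returned list represents the newly pushed stack words from the lowest address (where SP ends up) upward. The function name is made up.

```python
def push_stack_args(args, n_args):
    """Model stack_setup and the copy loop for the n_args > 4 case."""
    if n_args <= 4:
        raise ValueError("the loop only runs when there are more than four arguments")
    r5 = n_args             # ADD R5, R1, R2,LSL #2: one past the last argument
    r2 = r5 - (n_args - 4)  # the two SUBs at stack_setup: stop position, always index 4
    stack = []
    while True:
        r5 -= 1                    # LDR R0, [R5,#-4]!  decrement R5, then load
        stack.insert(0, args[r5])  # STR R0, [SP,#-4]!  decrement SP, then store
        if r5 == r2:               # CMP R5, R2 / BNE stack_setup_loop
            break
    return stack


pushed = push_stack_args([1, 2, 3, 4, 5, 6, 7], 7)
```

The arguments are pushed last to first, so when the loop ends SP points at the 5th argument, as the calling convention requires. The first four elements of args are never touched by this loop.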
> Register restoration:
>
>     LDMIA R1, {R0-R3}
>
> This instruction loads the registers R0 through R3 from the stack frame at the address R1 and increments R1.
Aside from questionable “stack frame” and “restoration” this is mostly right, but uninformative. A more useful explanation would note R1 here is the original args argument, so this simply copies the first four words, in order, from the start of the array, to R0 through R3. This handles the four register arguments in the case there were more than four total arguments.
It’s also incorrect to say it increments R1: The LDMIA (load multiple increment after) does load four sequential values starting at the address in R1, but R1 is in the list (R0 through R3) of registers being loaded to, so the value of R1 is overwritten by the second element of args, not incremented.[9]
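A small Python model of the non-writeback LDMIA makes the R1 point obvious. Registers are a dictionary, memory is a dictionary keyed by word-aligned address, and the address 0x2000 is an arbitrary value picked for the example. This only sketches the form used in this code, with the register list given in ascending order.

```python
def ldmia(regs, memory, base_reg, reg_list):
    """Model LDMIA base, {reg_list} without writeback.

    Loads consecutive words, starting at the address held in base_reg,
    into the listed registers (given in ascending register order).
    Returns a new register dictionary.
    """
    out = dict(regs)
    addr = regs[base_reg]
    for name in reg_list:
        out[name] = memory[addr]
        addr += 4
    return out


memory = {0x2000: 11, 0x2004: 22, 0x2008: 33, 0x200C: 44}
before = {"R0": 0, "R1": 0x2000, "R2": 0, "R3": 0}
after = ldmia(before, memory, "R1", ["R0", "R1", "R2", "R3"])
```

After the load, R1 holds 22, the second word of the array. It does not hold 0x2004 or any other incremented address.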
> Function call:
>
>     BLX R12
>
> This instruction branches to the function at the address R12.
Correct, though it would be more informative if it mentioned R12 was the first argument to the call_func_ptr function being explained, and BLX is specifically a call instruction, which saves the return address in LR.
For a human analyst, this would likely be a key point to understanding the whole function: Once you know the first argument is a function pointer, the purpose of the rest of the register shuffling and memory copying should be fairly obvious.
> Stack cleanup:
>
>     MOV SP, R4
>     POP {R4-R5,LR}
>
> These instructions restore the stack pointer (SP) to its original value and pop the registers R4 through R5 and LR from the stack.
Correct.
> Function return:
>
>     BX LR
>
> This instruction branches to the return address stored in the LR register.
Correct.
> Pseudo-code:
>
> Here is a pseudo-code representation of the entire function:
>
>     call_func_ptr(func_ptr, arg1, arg2, arg3):
>         # Save the calle
This is where Bard’s initial response ended. At first I just assumed this was Bard being terrible and not forming a coherent story, but eventually I noticed the markdown lacked a closing code tag, the total output was suspiciously close to 32K, and the alternative outputs it offered were similarly truncated. So it seems like this is some kind of content length limit, though the UI provided no indication and a quick search didn’t find it explicitly documented.
Anyway, I replied with: “The psuedo-code[10] above appears incomplete. Please provide complete pseudo-code for the assembly language function you explained previously”, and Bard continued:
> Sure, here is a complete pseudo-code representation of the entire assembly language function:
>
>     call_func_ptr(func_ptr, arg1, arg2, arg3):
>         # Save the caller's registers on the stack
>         save_registers()
>         # Set up the function's stack frame
>         setup_stack_frame(func_ptr, arg1, arg2, arg3)
>         # Restore the caller's registers
>         restore_registers()
>         # Call the function
>         call_func(func_ptr)
>         # Clean up the function's stack frame
>         cleanup_stack_frame()
>         # Return to the caller
>         return()
This does capture the fact that call_func_ptr calls a function pointer[11] and does something with the arguments, but the details range from wrong to nonsensical. In particular, it claims that call_func_ptr takes four arguments, when it actually takes three[12], and it implies the arguments after func_ptr are passed to func_ptr, rather than specifying an array which in turn contains an arbitrary number of actual arguments.
The suggestion that setup_stack_frame needs the address of the function to be called seems nonsensical, and has no support in the line-by-line description or any relation to the actual code. It’s also noteworthy that the made-up error handling in the line-by-line’s “stack setup” section is not mentioned.
Finally, it doesn’t include any control flow. Understanding the execution paths would be a primary focus for a human analyst, but Bard’s explanation just breezes through it sequentially here, and except for the loop, mostly ignores it or gets it wildly wrong in the line-by-line.
Bard goes on to “explain” the pseudo-code:
> Here is a more detailed explanation of each step:
>
> Save the caller’s registers on the stack:
>
> The save_registers() function saves the registers R4 through R5 and LR onto the stack. This is done to preserve the values of these registers while the function is running.
This essentially reiterates the line-by-line. Fair enough.
> Set up the function’s stack frame:
>
> The setup_stack_frame() function performs the following steps:
>
> 1. Calculates the address of the function’s stack frame by adding the values of the R1 and R2 registers, shifted left by 2 bits.
> 2. Subtracts 4 from the R2 register to account for the space needed for the return address and the four function arguments.
> 3. Copies the caller’s stack frame to the function’s stack frame.
> 4. Stores the function pointer in the first slot of the stack frame.
> 5. Stores the values of the arg1, arg2, and arg3 registers in the next three slots of the stack frame.
This is wrong, including some nonsense from the line-by-line (in #1 and at least partially #2), and also some new nonsense that didn’t appear in the line-by-line.
In particular #4 and #5 appear to be new, and also completely made up. On #4, the function pointer is passed in using R0 and moved to R12 for the call, never stored to memory in any form. Similarly for #5, none of the other arguments to call_func_ptr are stored to memory.
It also omits some nonsense that appears in the line-by-line, such as “This means that the stack frame is too large and the function needs to return an error.” and “This effectively branches to the return address at the top of the stack frame.”
> Restore the caller’s registers:
>
> The restore_registers() function restores the registers R0 through R3 from the stack frame at the address R1 and increments R1.
This is wrong, for the reasons explained in the line-by-line. The corresponding code is loading the registers for func_ptr, and it would make no sense to restore the caller’s registers before calling func_ptr.
> Call the function:
>
> The call_func() function branches to the function at the address R12.
>
> Clean up the function’s stack frame:
>
> The cleanup_stack_frame() function restores the stack pointer (SP) to its original value and pops the registers R4 through R5 and LR from the stack.
>
> Return to the caller:
>
> The return() function branches to the return address stored in the LR register.
These all generally reflect the line-by-line and are reasonable.
> I hope this more detailed explanation is helpful. Please let me know if you have any other questions.
Sure thing, dude.
What’s the moral of this story?
Reverse engineers, rejoice! Bard ain’t taking your job any time soon, though in less good news, you may need to battle an army of marketing droids and middle managers who think it can.
OK, so Bard did a catastrophically bad job of explaining some random assembly. Did I really expect an LLM to usefully explain non-trivial assembly code or is it just a case of play stupid games, win stupid prizes?
I admit I had low expectations, and picked code I expected to be challenging, but OTOH Google promotes “explain code” as a thing Bard can do, and this is ultimately fairly straightforward code[13]. I have also seen people suggest LLM tools like this could be used for reverse engineering, so a concrete example to back up my inevitable “oh FFS no” may have some value.
Beyond that, there’s nothing magic about assembly[14]. It’s just code, and it seems safe to assume that even if Bard can on average do a better job “explaining” some other sorts of code, the underlying failure modes still apply.
How about them failure modes, anyway?
- It’s wrong. A lot. Both subtly and outrageously, about minor details and fundamental characteristics.
- Even when right, the explanation is not very good. Almost everything it gets completely correct, you could find by looking up the instruction in a manual. A non-LLM tool which had all the instruction descriptions in a database and displayed them on demand would be far superior for this task[15], since Bard sometimes gets the basics wrong too.
  Some of my “would be more informative” comments may be nitpicks, but the point is that merely paraphrasing the formal instruction definition is an extremely low value explanation. A tool that described where values involved in a particular operation came from would have substantially more value.
- It fails spectacularly at the bigger picture. The real difficulty of understanding assembly is not about what individual instructions do (which again, can usually be looked up), but keeping track of how they fit together into larger scale logic[16]. The fact that it entirely butchers or ignores control flow makes it completely unsuitable for this. At a slightly more granular level, many of Bard’s explanations are two sentences where the first describes the instruction roughly as a manual might, and the second purports to explain why, like:
  > Add the values of the R1 and R2 registers, shifted left by 2 bits, to the R5 register. This calculates the address of the function’s stack frame.

  which would be a helpful approach except that, as here, the second part is almost always complete bullshit.
  The pseudo-code similarly purports to give you the big picture, but in fact reflects neither the actual underlying logic nor the line-by-line “explanation”.
- The stuff that’s wrong often looks superficially plausible. The explanation for the NOP is completely wrong in the specific case, but is generally an explanation that could be correct in others.
  To identify when it’s wrong, you frequently need to understand the thing it’s wrong about. Which seems sub-optimal for a system that purports to explain things to people who presumably want an explanation because they don’t currently understand the thing.
- LLMs, including Bard, fundamentally don’t understand things, and it shows. Being generous, the pseudo-code almost kinda-sorta reflects what the function does, and also kinda-sorta matches the line-by-line “explanation”, but in the specifics the entire thing is wrong, incoherent and self-contradictory.
  There is no chance you could implement the original function from Bard’s description, and if you tried to implement what it describes, after stripping out the incoherent bits you’d most likely end up with something like:

      call_func_ptr(func, a, b, c):
          func(a, b, c)

  which would be trivial, quite pointless, and also would require almost none of the code implied by Bard’s description.[17]
More generally, this provides yet another example of one of the pernicious failure modes of LLMs as information sources: They are often fairly good at getting well known stuff approximately right, but transition seamlessly into utter bullshit as soon as you move to anything more obscure. If you only know a little assembly, you might well see the first few lines check out and figure the rest must be just as good. It ain’t.
Closing thoughts
None of this will surprise those who view LLMs as Frankfurtian bullshit machines or spicy autocomplete.[18]
One common reaction I see to discussion of LLM failures like this is “well, humans get stuff wrong too” which is true, as far as it goes: Humans do regularly make errors.[19] But the kind of errors humans make largely do not resemble the kind Bard makes here. Sure, you might get the occasional over-confident newbie or BSer on Stack Overflow[20], but generally speaking people explaining code to each other do not just make stuff up out of thin air, and do make some effort to communicate how confident or not they are about their understanding.
The LLM providers tend to address concerns about errors by plastering their products with disclaimers. In the case of Bard, “Use code with caution” appears on every code block and “Bard may display inaccurate info, including about people, so double-check its responses” appears at the bottom of the page, but it’s hard to see this as anything other than CYA when they simultaneously hype how the product is about to revolutionize everything.[21]
All of which brings me back to the question that’s bugged me since these things first appeared: What is supposed to be the value of a system which provides “information” that requires you to go through and use other sources to determine whether each significant element is bullshit?
Bonus comedy
- One of the transcript options in Bard is to export to a Google Doc. Which gave me this:
  The cyberpunk assembly code leaking out over the left side turns out to be because Bard used the entire prompt, complete with newlines and spaces, as the filename. This overflows the docs UI and is unclickable and blocks the underlying UI unless you edit the document name.
- This page will presumably be crawled and ingested for future LLM training, leading to Bard’s BS explanations of ARM assembly being picked up by future models. I for one welcome our Habsburg AI[22] overlords!
1. For suitably broad values of “great”, inclusive of the sense that “web3 is going great”.
2. and snark.
3. Wait, why would you do this, instead of just calling func directly? Primarily to allow calling arbitrary firmware code from Lua scripts. In this house, arbitrary code execution is a feature, not a bug.
4. The stdarg.h va_ macros do roughly the opposite of this, taking variable arguments and turning them into a va_list, but functions like vsprintf which accept a va_list must be explicitly written to do so. There are of course other ways to achieve the desired result, for example, you could just pick an arbitrarily large number of arguments and pass dummies, because, as in most C calling conventions, arguments beyond the expected number will just be ignored.
5. Of course, the arguments aren’t named in the code provided to Bard, but R5 = arg1 + arg2*4 is still far more informative.
6. In fairness, I didn’t tell Bard what architecture or dialect the code is, but it should be recognizable to someone trained on the entire internet, and if the dialect is unknown, there would certainly be no justification for assuming the branch was two bytes or that instructions needed to be four byte aligned. In fact, outside of a few special cases, ARM dialects that include 16 bit instructions are generally two byte aligned.
7. Since it would never be executed, it doesn’t actually need to be a NOP.
8. In fact, setting all four would be harmless, since any of R0 through R3 not used as arguments are ignored, but without other changes, it would risk running off the end of args, though if you want to get sporty, that too would probably also be OK on the platform in question.
9. The writeback form LDMIA Rn!, ..., would increment R1, but isn’t present in this code and makes no sense if the register is also in the list to be loaded.
10. [sic] Yeah, that’s a typo and I’m not gonna re-do it.
11. It’s unclear whether Bard got this from the name or the code or the combination. For a test more representative of reverse engineering, I should have changed all the labels to something like loc_xxxxxxxx as disassemblers do.
12. A human analyzing this code can determine that it only takes three arguments with 100% certainty, because arguments go in R0-R3 and the only places where R3 is referenced overwrite it with values from args.
13. It arguably should be much easier to explain than most code, given that it consists of a very small number of simple, well defined instructions, rather than, say, a bunch of calls to complicated APIs of uncertain versions interacting with unpredictable users and networks.
14. In this respect, at least, assembly language differs from friendship.
15. Producing such a tool might get you in trouble with the copyright owners of the manuals, unless perhaps you happen to be a multi-billion dollar company and obfuscate the content in a big pile of linear algebra.
16. For this author anyway: My eyes tend to glaze over at stretches of code that juggle more than a few values in a not totally obvious way. YMMV.
17. Obviously, it would be trivial to write in C, and in ARM, it could be written

        MOV R12, R0
        MOV R0, R1
        MOV R1, R2
        MOV R2, R3
        BX R12

    If you’re wondering, wait shouldn’t that be BLX? No, this is a tail call: When the function in R12 returns it will use the original LR to return directly to the caller of this function.
18. I unfortunately don’t know who to credit this excellent coinage to. Hit me up if you have the source.
19. I’m sure my description of code in this post contains a few, feel free to point them out on mastodon.
20. Though rumor has it the over-confident BSers you encounter these days are likely to be using an LLM to boost their clout by posting more answers about things they don’t actually understand.
21. Not infrequently adding a caveat along the lines of “if it doesn’t go full skynet and kill us all first”, which always struck me as an odd sales pitch.
22. https://twitter.com/jathansadowski/status/1625245803211272194