You’ll never believe it, but I think I actually found something interesting to say about CONVERSION SPECIFIERS, of all things. What is a conversion specifier? It’s the technical term for the character that ends a % placeholder in a printf()-style format string and tells the function how to interpret and print the matching argument. For example, the d in %d prints an int as a decimal number, as in DEBUG ((EFI_D_INFO, "IndexBlockDevice is %d\n", IndexBlockDevice)).

I run into problems using DEBUG print statements when I assume UEFI uses the same conversion specifiers as the C standard. It does not. The result is things like DEBUG statements printing only every other character, a classic ASCII vs. Unicode issue. This article compares how conversion specifiers differ between the C standard and the EDKII implementation of UEFI.

According to the EDKII Coding Standards, the C dialect to be used with EDKII is:

ISO/IEC 9899:199409, or C95, with some elements from C99

Therefore, I’ll compare the conversion specifiers defined in C99 to the conversion specifiers defined in EDKII’s PrintLibInternal.c file.

Defined in C99:

specifier	argument type	output format
d, i	int	decimal
u	unsigned int	decimal
o	unsigned int	octal
x	unsigned int	hex, lowercase
X	unsigned int	hex, uppercase
f	double (float is promoted)	float, decimal
e, E	double (float is promoted)	exp. notation, decimal
a, A	double (float is promoted)	exp. notation, hex
g, G	double (float is promoted)	float or exp. notation, chosen from the exponent and precision
c	int	single character
s	char *	zero-terminated string
n	int *	stores the # of characters printed so far; prints nothing
p	void *	an address, in an implementation-defined format (usually hex)
%	(no argument)	prints a “%”

Defined in UEFI:

Conversion specifiers are defined in UEFI in the following location:
UDK2014.SP1.P1\MyWorkSpace\MdePkg\Library\BasePrintLib\PrintLibInternal.c
The code in question is in the BasePrintLibSPrintMarker() function, at the switch (FormatCharacter) statement on line 528. Notice that this piece of code leans heavily on falling through from one case label to the next.

Highlighting the Differences:

These are the differences I see between the C99 standard and the UEFI implementation as defined in PrintLibInternal.c.

case ‘s’, case ‘S’, and case ‘a’

In C99, ‘s’ prints a string of single-byte characters (a char *), and ‘S’ is not defined. To print a wide-character string such as L"my string", the programmer adds the ‘l’ length modifier and writes %ls.
In UEFI, both ‘s’ and ‘S’ are defined to be exactly the same thing, which is a Unicode (CHAR16) string:

case 's':
case 'S':
    Flags |= ARGUMENT_UNICODE;

If you use either ‘s’ or ‘S’ and pass it a string literal WITHOUT the ‘L’ prefix, the library reads the single-byte characters two at a time as 16-bit characters, and you’ll see the problem where every other character gets skipped in the output. To properly print a single-byte string, the programmer uses the specifier ‘a’, as in DEBUG ((EFI_D_INFO, "say %a\n", "hello, world")).

case ‘g’ and ‘G’

C99 uses ‘g’ and ‘G’ for floating point numbers. In UEFI, things are very different. ‘G’ is not defined, whereas ‘g’ is a special specifier that signifies the programmer wants to print a GUID; the matching argument is a pointer to an EFI_GUID. The UEFI type EFI_GUID is a struct that defines a GUID:

typedef struct {
  UINT32  Data1;
  UINT16  Data2;
  UINT16  Data3;
  UINT8   Data4[8];
} EFI_GUID;

The ‘g’ conversion specifier formats the EFI_GUID so that it prints in a readable fashion, like:

9093E6D2-1B59-4AB8-9E08-3EF4A2108FC9

case ‘t’

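The ‘t’ specifier is not defined as a conversion specifier in C99 (there, t is only a length modifier for ptrdiff_t, as in %td), but it is defined in UEFI as “Time”. It takes a pointer to an EFI_TIME structure and prints that structure’s month, day, year, hour, and minute.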

case ‘r’

The ‘r’ specifier is another one unique to UEFI. Its job is to print out values of type EFI_STATUS, but in a user-friendly way. Type EFI_STATUS is a typedef of UINTN, an unsigned type; error codes have the highest bit (MAX_BIT) set, while success and warning codes do not. From PrintLibInternal.c:

"Success",                   //RETURN_SUCCESS              =0
"Warning Unknown Glyph",     //RETURN_WARN_UNKNOWN_GLYPH   =1
"Warning Delete Failure",    //RETURN_WARN_DELETE_FAILURE  =2
"Warning Write Failure",     //RETURN_WARN_WRITE_FAILURE   =3
"Warning Buffer Too Small",  //RETURN_WARN_BUFFER_TOO_SMALL=4
"Load Error",                //RETURN_LOAD_ERROR           =1 |MAX_BIT
"Invalid Parameter",         //RETURN_INVALID_PARAMETER    =2 |MAX_BIT
"Unsupported",               //RETURN_UNSUPPORTED          =3 |MAX_BIT
"Bad Buffer Size",           //RETURN_BAD_BUFFER_SIZE      =4 |MAX_BIT
"Buffer Too Small",          //RETURN_BUFFER_TOO_SMALL,    =5 |MAX_BIT
"Not Ready",                 //RETURN_NOT_READY            =6 |MAX_BIT
"Device Error",              //RETURN_DEVICE_ERROR         =7 |MAX_BIT
"Write Protected",           //RETURN_WRITE_PROTECTED      =8 |MAX_BIT
"Out of Resources",          //RETURN_OUT_OF_RESOURCES     =9 |MAX_BIT
"Volume Corrupt",            //RETURN_VOLUME_CORRUPTED     =10|MAX_BIT
"Volume Full",               //RETURN_VOLUME_FULL          =11|MAX_BIT
"No Media",                  //RETURN_NO_MEDIA             =12|MAX_BIT
"Media changed",             //RETURN_MEDIA_CHANGED        =13|MAX_BIT
"Not Found",                 //RETURN_NOT_FOUND            =14|MAX_BIT
"Access Denied",             //RETURN_ACCESS_DENIED        =15|MAX_BIT
"No Response",               //RETURN_NO_RESPONSE          =16|MAX_BIT
"No mapping",                //RETURN_NO_MAPPING           =17|MAX_BIT
"Time out",                  //RETURN_TIMEOUT              =18|MAX_BIT
"Not started",               //RETURN_NOT_STARTED          =19|MAX_BIT
"Already started",           //RETURN_ALREADY_STARTED      =20|MAX_BIT
"Aborted",                   //RETURN_ABORTED              =21|MAX_BIT
"ICMP Error",                //RETURN_ICMP_ERROR           =22|MAX_BIT
"TFTP Error",                //RETURN_TFTP_ERROR           =23|MAX_BIT
"Protocol Error"             //RETURN_PROTOCOL_ERROR       =24|MAX_BIT

So, rather than print out some difficult to decipher number like FFFFFFFF…etc., you get a friendly “Invalid Parameter” or “No Media” message in the debug output, saving you from having to convert a cryptic error number to a meaningful error message.

case ‘x’ and ‘X’

C99 defines ‘x’ and ‘X’ to be the same, except that the hex numbers [a-f] are printed in lowercase when ‘x’ is specified, and in uppercase when ‘X’ is specified.

In UEFI, the hex digits A-F are always printed in uppercase, whichever of the two specifiers you use. Instead, the difference is that ‘X’ sets the PREFIX_ZERO flag, so the number is padded with leading ‘0’ characters to fill the field width.

case 'X':
    Flags |= PREFIX_ZERO;
case 'x':
    Flags |= RADIX_HEX;

Conclusion

Those are the differences as I see them. If I misunderstood something, let me know—leave a comment below. Thanks for reading!
