After the overview of the Rust string and its internal substructures, let's see whether C++ QString storage is lighter. But first we're going to take a look at the C++ standard string object:
At first sight we can see the allocation and deallocation generated by the clang++ compiler; DAT_00400d34 is the string.
If we use the same algorithm as the Rust code, but in C++:
We get a different decompilation layout. Note that Ghidra scans C++ binaries very quickly, while with Rust binaries it goes crazy for a while.
Locating main is also very simple in a compiled C++ binary; indeed, it is more low-level than Rust.
The byte array is initialized with a simple move instruction:
00400c4b 48 b8 68 MOV RAX,0x6f77206f6c6c6568
And basic_string generates the string. In the case of Rust this was a crazy, endless set of calls, detected by Ghidra as a runtime; nevertheless, basic_string is an external imported function not included in the binary.
(gdb) x/x 0x7fffffffe1d0
0x7fffffffe1d0: 0xffffe1e0 low str ptr
0x7fffffffe1d4: 0x00007fff high str ptr
0x7fffffffe1d8: 0x0000000b sz
0x7fffffffe1dc: 0x00000000
0x7fffffffe1e0: 0x6c6c6568 "hello world"
0x7fffffffe1e4: 0x6f77206f
0x7fffffffe1e8: 0x00646c72
0x7fffffffe1ec: 0x00000000 null terminated
(gdb) x/s 0x7fffffffe1e0
0x7fffffffe1e0: "hello world"
auto s = string(cstr);
string s2 = "test";
Clang puts both stack strings together:
[ptr1][sz1][string1][null][string2][null][ptr2][sz2]
C++ QString datatype
Let's look at the feature-rich QString object, defined in qstring.cpp and qstring.h.
Some QString methods use the QCharRef class, whose definition is below:
class Q_EXPORT QCharRef {
friend class QString;
QString& s;
uint p;
Searching for the properties of the QString class, I realized that one improvement Rust and Go make is the separation of properties from methods. In the large QString class the properties are hidden among hundreds of methods, but basically the storage is a QStringData *.
After removing the methods from the QStringData class definition we have this:
struct Q_EXPORT QStringData : public QShared {
QChar *unicode;
char *ascii;
#ifdef Q_OS_MAC9
uint len;
#else
uint len : 30;
#endif
uint issimpletext : 1;
#ifdef Q_OS_MAC9
uint maxl;
#else
uint maxl : 30;
#endif
uint islatin1 : 1;
private:
#endif
};
So the storage is two pointers plus two 30-bit unsigned lengths (len and maxl), each followed by a one-bit flag.
Regarding QChar, it holds two ushorts:
private:
ushort ucs;
and a set of enums for describing the type of character.
ucs is the two-byte Unicode value; Latin-1 is treated as a single byte, so if the ushort is > 0xff the conversion returns 0.
We have some quite nice methods implemented in the QChar class to classify the stored character.
But let's see the real storage of a QString object:
GDB doesn't recognize the object well; the print and display output is incomplete.
Kdbg displays some of the substructures, along with the text length:
In practice, the QString is an object that holds the two-dword pointer to the metadata, then a pointer to the end, and then the null-terminated string text.
The pointer 0x55555576f6a0 points to the metadata:
Here we have some metadata, like the string size, type of data, and so on.
So... is it possible to overlap the metadata of a string and perform a QString overflow? The answer is yes: if the previous string is a controlled char *, then we can redirect the metadata pointer to our buffer.
Binary size benchmark:
Some curious stats about the binary file size, which are not conclusive because some standard libs are included in the binary.
Binary file length:
Turbopascal 2,928
c++ string 21,256
Free pascal 32,256
c++ QString 280,496
golang 2,020,112
rust 2,625,776
The Rust hello world binary is about 2,625,776 bytes; the C++ one is 21,064 bytes.