In UTF-8, the first byte of a character determines the total number of bytes that character occupies. A byte with the bit pattern 10XX_XXXX is a continuation byte and is therefore invalid as the first byte of a character.
For example, Unicode code points are written in hexadecimal notation with the prefix U+, as in U+XXXX, telling the reader (or a compiler) that the following hexadecimal string is to be taken as a Unicode code point and not just a plain number. In the list below I place the binary representation of each code space in parentheses, with X standing for any bit value (either 1 or 0); these are just to demonstrate what the encoded character looks like in binary.
U+0000 - U+007F (0XXX_XXXX): This is the standard 128-character ASCII set.
U+0080 - U+07FF (110X_XXXX 10XX_XXXX): This is the second code space in UTF-8.
U+0800 - U+FFFF (1110_XXXX 10XX_XXXX 10XX_XXXX): This is the third code space in UTF-8.
U+10000 - U+10FFFF (1111_0XXX 10XX_XXXX 10XX_XXXX 10XX_XXXX): This is the fourth code space in UTF-8.
Therefore, counting the number of actual characters in a UTF-8 string could, in theory, be as simple as this: initialize an accumulator to zero and a pointer to the start of the string, then check whether the MSB of each byte is zero. If it is, increment both the accumulator and the pointer by one. If it is not, find where the first zero bit occurs within that byte, counting from the MSB; that position tells you how many bytes the sequence should span. Then verify that the number of continuation bytes following the current byte, plus the current byte itself, actually matches the count stated by that initial byte. If it matches, advance the pointer by that number of bytes but increment the accumulator by only one. If it doesn't, the entire string is invalid; throw some kind of error, and don't ever try to programmatically fix an invalid string. It'll never work... When you reach the null byte, the accumulator holds the number of actual UTF-8 characters in the string.
Edit 1: Keep in mind that UTF-8 is built from 8-bit code units, hence the 8 in UTF-8, with a single character spanning one to four of them. In contrast, UTF-16, which is what wchar_t holds on some platforms (notably Windows, where wchar_t is 16 bits), is built from 16-bit code units, i.e. words or shorts, hence the 16 in UTF-16.
Edit 2: Also, a buffer is, well, a buffer. Its capacity is not the exact size of the data it holds; it's some arbitrary size greater than or equal to the actual data. So technically, your question is really asking, "What size did I set my buffer to?"
Edit 3: UTF-8 is suffixed with an 8 because the initial byte of every encoded character is 8 bits. For example, the bit patterns 0XXX_XXXX and 110X_XXXX are both 8 bits wide. Given the second pattern, we know it must be followed by exactly one continuation byte, 10XX_XXXX, in order to point to a valid Unicode code point; but we only know this because we first scanned those initial 8 bits. It's not possible to reliably scan an entire UTF-8 document in steps larger than 8 bits, considering characters occupy an arbitrary number of bytes. The first character might be a single byte long, the next two bytes, and the one after that three. The key idea is that each of these characters begins with an 8-bit lead byte that informs us which Unicode code space the character points to and how many bytes follow, and those lead bytes are all 8 bits in size.
That previous statement implies that the UTF-16 encoding must then be suffixed with a 16 because its code unit, the initial unit that informs us which Unicode code space the character points to, is 16 bits long. Be careful, though: UTF-16 is also a variable-width encoding. A single 16-bit unit can only represent 65,536 values (just as 8 bits can only represent 256), so code points above U+FFFF are encoded as a surrogate pair of two 16-bit units. That means you cannot jump into a UTF-16 string at an arbitrary location and assume each unit is a whole character, though in both encodings you can resynchronize, since continuation bytes and low surrogates are recognizable as non-initial units.
Hope this all helped, I absolutely love all things encoding. I would definitely check out the Unicode website to find out exactly what a Unicode code space is, which characters belong to which block, and which code points are actually invalid, as Unicode doesn't assign every code point continuously in order starting from 0; it instead jumps all over the place based on category. UTF-8 and UTF-16 both point to the same Unicode code points but are just encoded differently: one as sequences of 8-bit units and the other as sequences of 16-bit units.