
Development ANSI string vs char[]

Discussion in 'Software' started by Colonel Sanders, 14 Jun 2007.

  1. Colonel Sanders

    Colonel Sanders Minimodder

    Joined:
    25 Jun 2002
    Posts:
    1,210
    Likes Received:
    4
    I have noticed a tendency for two different types of 'string' variables in code. I can either "#include <string>" and then "string myString;", which is what my college taught me to do, OR I can make a 'string' with a line of code like "char mystring[20]", which creates an array of 20 characters that can hold a 'string' of up to 19 characters plus the terminating null. That means I have to know the size of the string when I create it. :( With the "string myString;" line of code, I haven't noticed any fixed size for the variable that is created.

    But here is what causes a problem: when I try to use the "string myString" style declaration with other functions, such as opening a file, I have to put something like "ifstream.open(myString.c_str());" because fstream requests a "const char*" (I think that's right, I could be wrong). My point is, it seems like the "string" type is incompatible with a lot of functions?
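
    For what it's worth, the c_str() bridge described above can be sketched roughly like this. read_first_line is a hypothetical helper made up for illustration; the only library pieces used are std::ifstream and std::getline. Since C++11 the file stream constructors also accept a std::string directly, so the c_str() call is only needed on older compilers.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: returns the first line of a text file,
// or an empty string if the file cannot be opened.
std::string read_first_line(const std::string& path)
{
    // c_str() exposes the string's contents as a null-terminated
    // const char*, which is what pre-C++11 stream constructors expect.
    std::ifstream in(path.c_str());
    std::string line;
    if (in) {
        std::getline(in, line);
    }
    return line;
}
```

    So the string type is not really incompatible; it just needs converting at the boundary with functions that take a "const char*".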

    Does the ANSI string declare something different than simply an array of character values? Is the "string" type unicode, ASCII or what?

    L J
     
  2. DougEdey

    DougEdey I pwn all your storage

    Joined:
    5 Jul 2005
    Posts:
    13,933
    Likes Received:
    33
    The ANSI string is a class, I believe; it has to be in order to have "." member functions.

    It's based on a standard template (std::string is a typedef of std::basic_string<char>), and I believe it manages a growable buffer much like a vector does. I would check the header file and it'll have the details in there. I can't check because I'm at work.
     
  3. acron^

    acron^ ePeen++;

    Joined:
    15 Oct 2001
    Posts:
    2,938
    Likes Received:
    10
    Yeah, I've had the chance to use both char and wchar_t, and STL strings pretty extensively in my work so far and once you get confident with them, I'd encourage using an amalgam of both the type and the class.

    For example, if you're not performing manipulations on the string, there really is no reason to use a string.

    Code:
    const char text[20] = "Some text";  // a const array must be initialised in C++
    ...works just fine.

    When it comes to manipulations, a string can save a bit of effort. Rather than the old sprintf and argument method, you can just use operators:

    Without string:
    Code:
    char buffer[20];
    char text1[8] = "Hello";
    char text2[8] = "World";
    
    sprintf(buffer, "%s, %s!", text1, text2);
    
    With string:
    Code:
    string text1 = "Hello, ";
    char text2[8] = "World!";
    
    text1 += text2;
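
    One practical difference between the two snippets above is what happens when the text gets longer than expected. The sketch below is only an illustration: join_with_buffer and join_with_string are hypothetical names, and snprintf is swapped in for sprintf so that the fixed buffer truncates instead of overflowing.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Fixed-size approach: the result is cut off if it does not fit.
std::string join_with_buffer(const char* first, const char* second)
{
    char buffer[20];
    // snprintf never writes more than sizeof(buffer) bytes,
    // including the terminating null, so long input is truncated.
    std::snprintf(buffer, sizeof(buffer), "%s, %s!", first, second);
    return std::string(buffer);
}

// std::string approach: the string grows to whatever size is needed.
std::string join_with_string(const char* first, const char* second)
{
    std::string result = first;
    result += ", ";
    result += second;
    result += "!";
    return result;
}
```

    With "Hello" and "World" both give the same text; with a longer second word the buffer version stops at 19 characters while the string version keeps everything.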
    
    I believe that the reason string is unsupported in native functions is simply that it's a class and not a built-in type, so C-style functions that expect a "const char*" need c_str(). Some people I've come across won't go anywhere near STL strings, and for reasons like that you've got to remain as grounded and as unbiased as possible, I guess...
     
  4. Colonel Sanders

    Does the string support ASCII/unicode or what? What about the various types of encoding, such as ASCII, MBCS or DBCS?

    BTW, while on the subject - I am confused about MBCS and DBCS.

    For example, according to this guide, here is an example of multi-byte encoding:

    Code:
    43 3A 5C 83 88 83 45 83 52 83 5C 00
             LB TB LB TB LB TB LB TB
    C  :  \  some weird characters here
    LB = lead byte, TB = tail byte. What is odd here is that every lead byte has the same hex value. That looks like the "83" tells the PC "this one is Unicode"; however, I thought the purpose of double-byte Unicode was to use both bytes to represent a billion (yeah, I know that's not accurate) characters? If you have to place "83" and then the character, that seems like quite a waste? What do I not understand?

    L J
     
  5. DougEdey

    The lead byte may be telling it the character set, since those "weird characters", as you put it, are actually Katakana (the Japanese character set used for "imported" words).

    Katakana has 46 basic characters, I believe, and no upper or lower case, so it would make sense to give it a specific range. It probably envelops the hiragana character set as well.
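
    To make the lead-byte idea a bit more concrete, here is a rough sketch of how a program walks a double-byte string. It assumes the bytes are Shift-JIS (code page 932), which the thread does not actually confirm, and uses that encoding's lead-byte ranges 0x81-0x9F and 0xE0-0xFC. is_lead_byte and count_characters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: Shift-JIS / code page 932 lead-byte ranges.
bool is_lead_byte(unsigned char byte)
{
    return (byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC);
}

// Counts characters (not bytes) in a null-terminated DBCS string.
std::size_t count_characters(const unsigned char* text)
{
    std::size_t count = 0;
    while (*text != 0) {
        // A lead byte means "this character continues into the next byte".
        if (is_lead_byte(*text) && text[1] != 0) {
            text += 2;
        } else {
            text += 1;
        }
        ++count;
    }
    return count;
}
```

    For the bytes quoted earlier this counts 7 characters in 11 bytes: three single-byte ones and four double-byte ones. The lead byte is not a "this is Unicode" flag; under this assumption it selects a block of the character table, so other lead bytes cover other blocks. Note also that the last tail byte, 5C, has the same value as a plain backslash, which is why searching such a string byte by byte is unsafe.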
     
  6. Colonel Sanders

    Em, another question, why does it seem like nearly every string (non-ANSI) is a "CONST char []"? I thought a const variable simply meant that it could not be changed, or do I need to know more about the const keyword?

    L J
     
  7. acron^

    In answer to your previous question, use std::string for mbs or std::wstring for wcs.
    Both use the same set of functions.
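
    As a small illustration of "the same set of functions": both types are instances of the std::basic_string template, so one function template can serve either. count_char below is a hypothetical helper, not something from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Works for std::string, std::wstring, or any other std::basic_string.
template <typename CharT>
std::size_t count_char(const std::basic_string<CharT>& text, CharT wanted)
{
    std::size_t count = 0;
    typename std::basic_string<CharT>::size_type pos = text.find(wanted);
    while (pos != std::basic_string<CharT>::npos) {
        ++count;
        pos = text.find(wanted, pos + 1);
    }
    return count;
}
```

    Neither type knows anything about encodings, though. A std::string just stores char values, so size() reports bytes: a UTF-8 "é" stored as "\xC3\xA9" has a size of 2.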

    At a guess: since most of the time explicitly defined strings won't be manipulated, using "const char*" is just better practice.
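
    A short sketch of why the const shows up so often: in C++ a string literal is itself an array of const char, so a pointer to it has to be a "const char*", whereas an array initialised from a literal gets its own writable copy. kGreeting and shout_greeting are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// A string literal is read-only; pointing at it needs const.
const char* const kGreeting = "Hello";

// Hypothetical helper: edits a private copy of the literal instead.
std::string shout_greeting()
{
    char editable[] = "Hello";   // array initialised with its own copy
    editable[0] = 'J';           // fine: this memory belongs to the array
    // kGreeting[0] = 'J';       // would not compile: the target is const
    return std::string(editable) + "!";
}
```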
     
  8. Colonel Sanders

    So em, is a const char* style hard-coded to a specific value, or does that mean that the value will be set once at runtime and then no other code will modify the value?

    For instance, if I declare "const int maxsize = 300;" that means I have a hard-coded integer named maxsize and equal to 300. However, if I were to say "const int maxsize = someVariable;" then would my program create an un-modifiable variable equal to the value of someVariable, and the value of someVariable could vary with each run of the application?

    L J
     
  9. acron^

    I'm not actually sure on that one.

    There's no reason you couldn't type "const int maxsize = someVariable". It'd work perfectly, but you might have issues with scope. I've never tried declaring a const data member and then trying to define it in a constructor though, so I'm not sure if that would work.
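
    On the const data member question, a minimal sketch: it does work, as long as the member is set in the constructor's initialiser list rather than assigned in the constructor body. Buffer and max_size_ are hypothetical names.

```cpp
#include <cassert>

// Hypothetical class with a const data member.
class Buffer
{
public:
    // A const member cannot be assigned in the constructor body;
    // it has to be set in the member initialiser list.
    explicit Buffer(int requested_size) : max_size_(requested_size) {}

    int max_size() const { return max_size_; }

private:
    const int max_size_;  // fixed for the lifetime of each object
};
```

    Each Buffer can be given a different size at run time, but once constructed its max_size_ cannot be changed.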

    Give it a try ;)
     
  10. Flax

    Flax What's a Dremel?

    Joined:
    24 Jun 2002
    Posts:
    300
    Likes Received:
    1
    The effect of using const is that you create a variable that cannot be changed from its initial value. This is consistent wherever you can put the const keyword.

    The code the compiler actually produces when you create a constant variable depends on the context.

    The compiler will probably put the memory allocated to a global constant into the read-only data segment of the program; constants with local scope will be allocated on the stack when they come into existence during execution.

    So "const int maxsize = 300;" will be put into the readonly section (or possibly optimised out by the compiler, depending on how it's used), and "const int maxsize = someVar;" will be created on the stack.
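
    The two cases also differ at the language level, not just in where the memory ends up. A rough sketch, with hypothetical names (kMaxSize, g_table, clamp_to_limit, table_length): a const int initialised from a literal can be used as an array size, while one initialised from a run-time value is read-only but cannot.

```cpp
#include <cassert>
#include <cstddef>

// Initialised from a literal: usable wherever a constant expression
// is required, such as an array bound.
const int kMaxSize = 300;
int g_table[kMaxSize];

// Initialised from a run-time value: still read-only afterwards, but
// not a constant expression, so it could not size g_table above.
int clamp_to_limit(int value, int limit)
{
    const int local_max = limit;
    // local_max = 0;   // would not compile: assignment to a const
    return value > local_max ? local_max : value;
}

// Number of elements in g_table, worked out from its type.
std::size_t table_length()
{
    return sizeof(g_table) / sizeof(g_table[0]);
}
```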

    In the case where the variable is created on the stack, the compiler will typically output the same assembly code as if it were non-const; the only difference is that it will detect any places where code you have written would modify the contents of the variable, and fail with an error to let you know.
    The significance of a variable being in the readonly section is that the operating system will not allow you to write to it.

    C++ has a keyword, const_cast, that allows you to cast away the const-ness of a variable, and you can use it to see the difference between the two types of const.

    Code:
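    #include <cstdio>    // std::getchar
    #include <iostream>
    
    const int global_const = 10;    // typically placed in read-only data
    
    int main(int argc, char** argv)
    {
    	const int local_const = argc;    // run-time value, lives on the stack
    	
    	std::cout << "local_const = " << local_const << std::endl;
    	// Writing to a const object is undefined behaviour; demonstration only.
    	const_cast<int*>(&local_const)[0] = 13;
    	std::cout << "local_const = " << local_const << std::endl;
    	
    	std::getchar();    // pause before the part that is expected to crash
    	
    	std::cout << "global_const = " << global_const << std::endl;
    	const_cast<int*>(&global_const)[0] = 13;    // typically an access violation
    	std::cout << "global_const = " << global_const << std::endl;
    	
    	return 0;
    }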
    
    
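    On a typical build, the first half of the program appears to work and changes the value of the supposedly constant local_const, while the second half crashes with an unhandled Access Violation. Neither outcome is guaranteed: writing to an object that was declared const is undefined behaviour, so an optimising compiler may just as well keep printing the original values.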
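
    For contrast, here is a sketch of the cases where const_cast is well-defined: calling an old-style function that takes a non-const pointer but only reads, and writing through a const reference to an object that was never declared const. All three function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical legacy-style function that takes a non-const pointer
// even though it only reads through it.
int legacy_sum(int* values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Casting away const is safe here because nothing is written.
int sum_readonly(const int* values, int count)
{
    return legacy_sum(const_cast<int*>(values), count);
}

// Writing through the cast is well-defined only when `counter`
// refers to an object that was not declared const in the first place.
void bump(const int& counter)
{
    ++const_cast<int&>(counter);
}
```

    Calling bump on a plain int is fine; calling it on a variable declared "const int" would be the same undefined behaviour as in the program above.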

    As a kind-of-related note, if you haven't come across it before, the syntax for creating constant pointers is a little odd: you read them backwards.
    So to declare a pointer you can't change, pointing to an int you can change, you'd write "int* const varName", and a constant pointer to a constant int would be "const int* const varName" (both need an initialiser, since they can't be assigned to later).
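
    The read-it-backwards rule can be sketched as follows. pointer_const_demo and the variable names are hypothetical; the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>

// Read each declaration right to left.
int pointer_const_demo()
{
    int first = 1;
    int second = 2;

    int* const fixed_ptr = &first;        // const pointer to int
    *fixed_ptr = 10;                      // ok: the int can change
    // fixed_ptr = &second;               // error: the pointer cannot

    const int* view = &first;             // pointer to const int
    view = &second;                       // ok: the pointer can move
    // *view = 20;                        // error: the int is read-only here

    const int* const locked = &first;     // const pointer to const int
    // neither `locked` nor `*locked` can be modified

    return *fixed_ptr + *view + *locked;  // 10 + 2 + 10
}
```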
     
