In this tutorial, we are going to learn about Comment Lines & Tokens in a C program.

A C program usually consists of multiple statements. The source text of each statement is made up of one or more of the following three elements:

  1. Comments
  2. Whitespace characters
  3. Tokens

Understanding Comments in C:

As mentioned earlier, a computer program is a collection of instructions or statements. A comment is a piece of text in the program that the compiler ignores, so it is never executed.

Comments are mainly used for two purposes:

  1. To mark a section of executable code as non-executable, so that the compiler ignores it during compilation.
  2. To provide remarks (or explanation) on the working of the given section of code in plain English, so that other programmers can read and understand the code.

In C language, there are two types of comments:

  1. End-of-line comment: It starts with // and continues till the end of that line. It is also called a single-line comment, and it has been part of standard C since C99.
  2. Traditional comment: It starts with /* and ends with */. The content between /* and */ is the comment. It is also called a multi-line comment.

The code given below shows the two types of comments:

/* The C Programming Language is a general-purpose, high-level language that was originally developed by Dennis M. Ritchie in 1972 at AT&T (American Telephone and Telegraph) Bell Laboratories, New Jersey. It was mainly influenced by the languages 'BCPL' and 'B'. -> this is an example of a traditional comment */

#include <stdio.h>
int main() {
    printf("Hello! World");
    return 0;
} // end of the main() function - this is an example of an end-of-line comment

Given below are 3 important points regarding comments:

  1. There should not be any space between the forward slashes in //, i.e., / / is incorrect. Similarly, there should not be any space between the slash and star characters in /* and */, i.e., / * and * / are incorrect.
  2. Comments do not nest. A /* or */ has no special meaning inside a // comment, and a // has no special meaning inside a /* ... */ comment. Likewise, a /* inside a traditional comment does not start a second comment; the comment ends at the first */.
  3. Comment markers are not recognized inside character literals (i.e., characters enclosed between single quotes) or string literals (i.e., the text enclosed between double quotes). There, they are treated as part of the literal's content.

Whitespace Characters and Tokens in C:

Whitespace Characters:

In English, we use a space to separate two words. When it comes to typing text on a computer, there are different types of characters that are used to separate text by creating space. These are called whitespace characters.

The different whitespace characters in C are:

  1. Space ' ' – (ASCII SP) produced by pressing the spacebar.
  2. Tab '\t' – (ASCII HT) produced by pressing the Tab key.
  3. Form feed '\f' – (ASCII FF) traditionally used as a page separator between lines or paragraphs.
  4. Line termination characters (used to separate two lines) – produced by pressing the Enter key.
  • Line Feed – '\n' (ASCII LF, also called NL – New Line) – used on Unix, Linux, and Mac OS X (macOS) systems.
  • Carriage Return – '\r' (ASCII CR) – used on Mac OS 9 and earlier.
  • Carriage Return followed by Line Feed – '\r\n' (ASCII CRLF) – used on Windows systems.

Tokens in C:

The basic building blocks used to write a C program are called Tokens.

Consider the following example:

#include <stdio.h>
int main(void) {
     int a = 5, b = 10;
     printf("Hello! World");
     printf("\nSum of two numbers = %d\n", a + b);
     return 0;
}

In the above code, individual fragments such as void, main, and { are different types of tokens.
In C language, tokens are classified into six categories. They are mentioned below:

  1. Identifiers – These are simple names used to refer to or identify something. For example, the names of variables and functions are identifiers. In the above code, main, a, and printf are identifiers.
  2. Keywords – These are reserved words such as int, for, and if. The original ANSI C standard defines 32 of them, and later standards add a few more. These words have a special meaning in a program and cannot be used as identifiers.
  3. Constants – These are fixed values such as 5 and 10 that are used in a program.
  4. String Constants – These are specified within double quotes. For example, in the above code "Hello! World" is a string constant.
  5. Separators – The following are called separators: (   )   {   }   [   ]   ;   ,   .
  6. Operators – The following are called operators: +, -, *, /, =, >, <, >=, <=, etc.

Understanding Escape Sequences in character & string literals:

In C, the backslash character \ is used to make an escape sequence. An Escape Sequence is an escape character \ followed by a normal character. For example, \n or \t.

The escape character changes the meaning of the character that follows it. For example, when the string literal "Hello\tWorld" is printed, the result is:

Hello     World

In the string literal "Hello\tWorld", \t represents the TAB character. Similarly, if we want to print a double quote inside a double-quoted string literal, we need to escape the double quote by using the escape character \. For example,

printf("Hello \" (World)");

The above code will produce the following output:

Hello " (World)

A few points regarding escape sequences and their ASCII codes are given below:

  • Each sequence has a unique ASCII value.
  • Every escape sequence starts with the backslash \.
  • Although an escape sequence consists of two characters, it represents a single special character in the given context.

Happy Learning 🙂