In this tutorial, we are going to learn about the C floating-point data types float, double, and long double.

float Data type:

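In C, the float data type typically occupies 4 bytes (32 bits) of memory and stores real numbers, including values with a fractional part. On platforms that use the IEEE 754 single-precision format, a float can hold positive values from about 1.2E-38 (the smallest normalized value) up to about 3.4E+38, together with the corresponding negative values.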

A float has a precision of about 6 significant decimal digits. This counts all the digits of the number, not only those after the decimal point. Consider the following example that uses a float data type:

#include <stdio.h>

int main(void) {
     float a = 2.135;
     float b = 7.217;
     float sum;

     sum = a + b;
     /* %f prints six digits after the decimal point by default. */
     printf("Sum of two numbers = %f\n", sum);
     return 0;
}

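Here, 4 bytes of memory are allocated to each of the variables a and b, and they are initialized with the values 2.135 and 7.217 respectively. Because %f prints six digits after the decimal point by default, these values would be displayed as 2.135000 and 7.217000.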

A floating-point variable can represent a wider range of numbers than a fixed-point variable of the same bit width at the cost of precision. A signed 32-bit integer variable has a maximum value of 2^31 − 1 = 2,147,483,647, whereas an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2^−23) × 2^127 ≈ 3.4028235 × 10^38.

All integers with 7 or fewer decimal digits, and any 2^n for a whole number −149 ≤ n ≤ 127, can be converted exactly into an IEEE 754 single-precision floating-point value.

C does not offer unsigned floating-point types. Floating-point formats such as IEEE 754 always carry a sign bit, and there is no hardware support for unsigned floating-point operations.

double Data type:

In C, the double data type offers more precision and a wider range than float, and is used wherever a float is not sufficient.

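The double data type typically occupies 8 bytes (64 bits) of memory and stores real numbers, including values with a fractional part. On platforms that use the IEEE 754 double-precision format, a double can hold positive values from about 2.2E-308 (the smallest normalized value) up to about 1.8E+308, together with the corresponding negative values.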

double values have a precision of about 15 significant decimal digits. As with float, this counts all the digits of the number, not only those after the decimal point. Consider the following example using a double data type.

#include <stdio.h>

int main(void) {
     double num1 = 26.7368;
     double num2 = 1.42924;
     double sum;

     sum = num1 + num2;
     /* %f also works for double; it prints six digits after the decimal point. */
     printf("Sum of the numbers = %f\n", sum);
     return 0;
}

Here, 8 bytes of memory are allocated to each of the variables num1 and num2, and they are initialized with the real number constants 26.7368 and 1.42924 respectively.

To further extend the precision of a double data type, the user can use a long double data type. The long double type is guaranteed to be at least as precise as a double, but not necessarily more precise: its exact size and format vary from one compiler and hardware platform to another, and on some platforms it is identical to double.

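On x86 platforms with GCC or Clang, long double is commonly the 80-bit x87 extended-precision format: 10 bytes of data, usually padded to 12 or 16 bytes in memory. In that format it can hold positive values from about 3.4E-4932 to about 1.2E+4932. On other platforms long double may be a 64-bit or a 128-bit type.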

In a 32-bit Forth implementation, by contrast, the double data type is not a floating-point type but a double-cell (64-bit) integer. It has a range of −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (signed) or 0 to 18,446,744,073,709,551,615 (unsigned).

Happy Learning 🙂